Don’t hide the decline


US scientists should not be placated by the ‘flat budget’ myth. Funds are decreasing, and the
situation will get worse.


For US researchers, the annual unveiling of the presidential budget
request can be a time of both hope and trepidation. But after last
year’s fiscal battles with Congress, complete with an embarrass- 
ing government shutdown and painful across-the-board spending 
cuts, it was always clear that this year there would be little to celebrate. 

In that atmosphere, the unveiling on 4 March of President Barack 
Obama's US$3.9-trillion budgetary vision for fiscal year 2015 brought 
both disappointment and a sigh of relief. In one sense, the proposal 
was optimistic: it exceeded congressional spending limits by $56 bil- 
lion, and there were few deep cuts for science. But it leaves the budgets 
of major scientific funders, such as the US National Institutes of Health 
(NIH), the National Science Foundation (NSF) and the research 
efforts at the Department of Energy, essentially flat (see page 147). 

Amid a sluggish economy and zealous calls to tighten federal purse 
strings, the prevailing wisdom is often to be grateful for a flat budget. 
Things could be worse. But those projects that stand to be gutted — 
such as the Stratospheric Observatory for Infrared Astronomy (SOFIA),
an airborne observatory funded largely by NASA, which would have 
its budget slashed from $84 million to $12 million — stand as painful 
reminders that a flat budget is not something to celebrate. The proposed 
$200-million boost to bring the NIH’s budget to $30.2 billion is paltry, 
but even worse is the $1.3-billion cut that could be in store for the 
Department of Health and Human Services, the NIH’s parent agency. 

What is more, inflation does not stand still for flat budgets. Over- 
all spending on research and development would increase by 1.2% 
in 2015 if Obama has his way. But the rate of inflation that year is 
expected to be 1.7%. The outlook is worse for biomedical research — 
here, inflation is projected to rise by 2.2% in 2015, according to the 
Department of Health and Human Services’ Biomedical Research and 
Development Price Index. The 0.7% budgetary bump that Obama has 
requested will not keep pace. 
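
The gap between a nominal increase and inflation can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `real_change` is an assumption introduced here, and the inputs are simply the percentages quoted above (a 1.2% rise in overall research and development spending against 1.7% expected inflation, and the 0.7% NIH bump against 2.2% biomedical inflation).

```python
def real_change(nominal_growth: float, inflation: float) -> float:
    """Return the inflation-adjusted (real) change as a fraction.

    Both arguments are fractions, e.g. 0.012 for a 1.2% nominal rise.
    """
    return (1 + nominal_growth) / (1 + inflation) - 1


# Percentages quoted in the editorial for fiscal year 2015.
overall_rd = real_change(0.012, 0.017)  # all research and development
biomedical = real_change(0.007, 0.022)  # NIH request vs biomedical inflation

print(f"Overall R&D, real change: {overall_rd * 100:.2f}%")
print(f"NIH, real change:         {biomedical * 100:.2f}%")
```

On these inputs both results are negative: roughly half a percentage point for research and development overall, and about one and a half points for the NIH, which is why a nominally 'flat' budget amounts to a cut.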

Indeed, ‘flat’ budgets such as those proposed last week have steadily 
eroded the NIH’s coffers over the past decade. Controlling for infla- 
tion, the NIH’s budget shrank by 10% between 2004 and 2014, accord- 
ing to the American Association for the Advancement of Science in 
Washington DC. The real decline is even steeper when the rate of 
biomedical inflation is taken into account. 

A similar trend is emerging for research and development overall: 
federal spending on research and development in 2014 is 15.8% lower 
than in 2010 when inflation is considered. 

Greener pastures are nowhere in sight. The president's request was 
sent to Congress, which will produce a plan of its own. Included in 
Obama's request is a proposed $56-billion Opportunity, Growth, and 
Security Initiative that would add $5.3 billion to the nation’s research 
and development coffers. But there is little reason to hope that the 
initiative will make it through a US Congress determined to rein in 
spending, opposed to raising taxes and not generally known for a 
willingness to compromise. These are, after all, the same legislators
who in October shut down the government for 16 days and allowed
across-the-board spending cuts of 5% last year. Science suffered as
a result: the NSF awarded 690 fewer grants in 2013 than the previous
year, according to figures released last week by the Government
Accountability Office. The NIH cut its grants by 750. The White
House’s budget proposal makes it clear: there will be no compensation
for these lost opportunities.

Meanwhile, the economic strain on the country is immense. Mandatory
spending obligations — on retirement and health-care programmes, for
example — are soaring, squeezing discretionary spending on other
worthy areas, including research. As a result,
discretionary programmes are battling over slices of a rapidly shrinking
pie: in 2010, discretionary funds were 39% of the budget; in 2015,
they will be 30%.

This means that the fight will only be more intense in years to come. 
Rather than a relief, apparently flat budgets are a sure sign that compe- 
tition for funds will grow still further. And that things will get worse 
before they get better.


An elegant chaos 


Universal theories are few and far between in 
ecology, but that is what makes it fascinating. 


To some scientists in other fields, ecology must seem relatively
straightforward. Many of the organisms live at a very human
scale and are easy to access, especially in community ecology.
Ecologists do not need special equipment to see and count elk. There
are no electron microscopes, space telescopes or drilling rigs that can
go wrong. Easy.

And yet, ecologists know that their subject can prove as troublesome
as any other. Ecology would be easy, were it not for all the ecosystems —
vastly complex and variable as they are. Even the most austere desert or
apparently featureless moor is a dense, intricate network of thousands of
species of photosynthesizers, predators, prey animals, parasites,
detritivores and decomposers. As naturalist E. O. Wilson put it: “A lifetime can
be spent in a Magellanic voyage around the trunk of a single tree.” And
not all of what one might learn from such a voyage would be transferable
to the next tree. History, chance, climate, geology and — increasingly
— human fiddling mean that no two ecosystems work in the same way.

Scientists like to impose structure and order on chaos, and ecologists
are no different. Ecology has its grand theories, but they are riddled
with conditional clauses, caveats and exceptions. There are clear patterns
at the global and single-species scales, but the middle ground
is, as biologist John Lawton affectionately put it in 1999, “a mess”. It
is doubtful that the generalities that underlie the complex patterns
of nature will ever be phrased succinctly enough to fit on a T-shirt.

This complexity is demonstrated by work that questions a famous 
and elegant ‘trophic cascade’ in Yellowstone National Park, Wyoming, 
discussed on page 158. The theory goes that wolves, restored to the park 
in the 1990s after decades of absence, scare elk away from certain areas. 
That has a knock-on effect for the rest of the food chain, allowing aspen 
and willows to flourish after decades of being browsed nearly to death. 
But studies in recent years suggest that wolves alone do not control the 
ecosystem. Other factors — the presence of beaver dams and grizzly 
bears, weather, hunting by humans and even climate change — also 
affect the elk population and the growth of trees and shrubs. 

It would be useful to have broad patterns and commonalities in 
ecology. To know how ecosystems will respond to climate change, or 
to be able to predict the consequences of introducing or reintroducing 
a species, would make conservation more effective and efficient. But a 
unified theory of everything is not the only way to gain insight. 

More ecologists should embrace the non-predictive side of their 
science. Teasing out what is going on in complex systems by looking 
at how ecosystems evolved, and by manipulating the environment in 
experiments, is just as much a science as creating formulae for how 
ecosystems work. 

Paradigm shifts, after all, are rare in ecology. Debates are often 
resolved when competing concepts combine, rather than when one 
pushes the other completely off the table. Take the contrasting ideas 
of top-down regulation of ecosystems by carnivores and bottom-up 
regulation effected by the nutrition available from plants. The field is
slowly working towards an integrated theory to predict when the top
will rule and when the bottom will be in charge — and that theory will 
take the time to consider the middle players, the herbivores. 

Other ecological debates have followed a similar path. Disagreement
over whether complex ecosystems are more or less stable than simpler 
ones, for example, is also settling to a consensus: it depends. 

Useful practical predictions need not stem from universal laws. They 
may come instead from a deep knowledge of the unique workings of 
each ecosystem — knowledge gained from observation and analysis. 

Proposing sweeping theories is exciting, but if ecologists want to
produce work useful to conservation, they might do better to spend
their days sitting quietly in ecosystems with waterproof notebooks
and hand lenses, writing everything down.

Ecological complexity, which may seem like an impenetrable thicket of
nuance, is also the source of much of our pleasure in nature. If
ecosystems were simple puzzles that all worked in the same way, they
would lose much of their mystery, their surprise and their beauty. A
lot of conservation work aims to
protect the complexity and variability that makes ecosystems so hard 
to understand, and indeed to conserve. 

Ecological rules are not the only reasons to promote conservation 
and fight extinctions. Sometimes we can argue for the conservation of 
particular species because ecology provides a scientific basis for it. At 
other times, we make the argument because there is a good chance that 
ecology will soon catch up and explain why the species are important. 

But even if some predators do little but sit at the top of their food 
pyramids, creaming off a few herbivores, would we really want to live 
in a world without them? Answering that question really is easy.


Share alike 


Research communities need to agree on 
standard etiquette for data-sharing. 


Every mountaineer knows the sinking feeling of reaching a
peak after a hard climb, only to see the true summit still above.
Scientists who take on the tough terrain of open access may have
a similar experience. After they reach the notable goal of sharing their
research papers, they discover that a higher summit awaits: open data.

In many fields, making research data available online for all is a step
beyond making research papers open-access. This might puzzle communities
that have already agreed to share. Biologists routinely upload
DNA sequences to the public repository GenBank, for example, creating
a scientific commons for everyone's benefit. There are now more than
600 subject-specific repositories, with community-specific standards.

Yet even some of the most strident open-access supporters baulk at
the concept of fully open data, judging by the reaction to a strengthened
data-sharing policy instituted by the Public Library of Science
(PLOS) this month. PLOS now requires researchers to make their
papers’ underlying data open online on publication, apart from data
that they have a duty to keep private, such as that on human study
participants (go.nature.com/rd27aa). Journals such as Molecular
Ecology have mandated the same thing for years. But the PLOS move
has provoked heated discussion and highlighted some important, yet
unsettled, aspects of the practice and ethics of online data-sharing.

A few years ago, a survey found that scientists cited a lack of time and
money, as well as technical barriers, to explain why they did not post data
online (C. Tenopir et al. PLoS ONE 6, e21101; 2011). It still takes time to
prepare data, but increasingly, other excuses do not fly. General-purpose
storage sites such as Dryad and figshare are cheap (or free) and suitable
for all kinds of data sets; data journals provide publication venues
appealing to the traditionally minded; and standards are emerging for 
citing other people's data sets (see Nature 500, 243-245; 2013). 

Harder to surmount are the feeling of data ownership and the fear
of being ‘scooped’. Years of toil to collect a data set that might support
a decade of career-making publications could be rendered moot when 
another researcher jumps on the information online. This is a particular 
problem for early-career researchers, and for those working with unique 
data sets in small ecology or environmental-science laboratories. 

Behind this fear is the worry that other scientists will not provide 
credit for the data they use. Research administrators place such impor- 
tance on paper authorship that it is probably not enough for a study 
that leans significantly on another researcher's hard-won data set to 
merely cite that researcher, perhaps depriving them of a publication.

Communities need to debate the ethics of data-sharing and agree 
on etiquette. When a researcher relies on another’s data, for example, 
it should be standard practice to invite the data-providers to be co- 
authors. Ecologists Clifford Duke and John Porter have suggested 
guidelines for deciding whether to extend such an invitation (C. S. Duke 
and J. H. Porter BioScience 63, 483-489; 2013); these include noting 
whether the data are integral to the new analysis, whether the data 
are unique or particularly novel, and whether the data-provider can 
fully participate in manuscript-writing by approving draft and final 
versions. Another ecologist, Dominique Roche, has urged disclosure 
of data reuse, and better communication between data generators and 
reusers (D. G. Roche et al. PLoS Biol. 12, e1001779; 2014).

It is not clear whether widespread online data-sharing will increase 
uncredited scooping. For now, Nature mandates 
uploading data when structured community data 
repositories exist, and encourages it otherwise. 
Before you can climb the highest mountains, you 
need proper safeguards and a decent map.
Practical science has a global reach and appeal


As English schools consider downgrading practical science, John Baruch
points out that other nations are rushing to include more.


England has a bizarre plan to downgrade the importance of
practical science in schools. A consultation just closed by the
Office of Qualifications and Examinations Regulation suggests
that practical science and laboratory work should no longer contribute
to the final mark for the A-level examination that students take at 18.

The move is especially odd given that other nations — Britain's competitors
— are waking up to the need to include more practical science
in their education systems. And British scientists are helping them to
do it. I am one of them.

These countries — China, Poland and Ireland among them — realize
that practical work is not just an integral part of science and essential
to understand how science works, it is the best route to give students
the skills they will need to support technological innovation. China,
especially, has ambitious plans here: officials are
working to change the culture of its school system
so that it recognizes and rewards practical skills.

Practical science is more than hands-on science.
It challenges the student to understand the
real world, to create ways to test that understanding
and to grasp the significance of statistics and
errors in their arguments.

I am an astronomer, and my subject has a
major advantage when it comes to practical and
hands-on experience. We can automate and offer
it remotely. At a stroke, this solves one of the
obstacles to practical science in schools across
the world: that lab work is expensive and requires
skilled teachers and laboratory technicians,
which are in short supply. Practical astronomy is
easier — given the right equipment.

The Universe travels over our heads every night,
and the only requirement for practical work is a telescope. In the late
1980s, the UK astronomical community, tired of the tedious need to
guide these large instruments by eye, decided to investigate robotic
telescopes. I was awarded a research contract to prove the concept of a
telescope that could work autonomously.

The result was the Bradford Robotic Telescope (BRT). Initially
perched high in the Yorkshire Pennines, it was the world’s first fully
automated instrument. Users submitted a list of objects they wished
to observe and waited for the results to be returned to them by e-mail.
Astronomers had priority, but the early years of the Internet allowed us
to open up its use to thousands of others. We gave them free access to
the instrument when the astronomers were not using it.

The telescope was then transferred to Mount Teide in the Canary
Islands. It does everything for users: evaluates the
weather, schedules itself to optimize observations,
takes calibration data and returns the whole package
along with analysis programmes. This is now
an option on nearly all large telescopes, but for
members of the public it remains unusual. (The norm is for users to
have a fixed time slot of half an hour or so to drive the telescope, using
web cameras at the observatory to see whether the weather is suitable
and to move the telescope to point at the object they want to observe.)

Around 90,000 students and 2,500 teachers in Britain use the BRT. 
Secondary schools pay £195 (US$326) a year; primary schools £70. 
Every child has a username and can log on from home. More than one- 
third do so, and the results are stunning: children race back to school 
the next day to tell their teachers what they found. 

In many schools, the telescope forms part of the GCSE astronomy 
programme (taken at around age 15), and astrophysics modules of the 
physics A level — one of the subjects for which officials are now trying 
to downgrade practical experience. 

Britain's loss could be China's gain. China has 
traditionally shown little interest in practical sci- 
ence in schools. Practical work does not feature in 
the school-leaving, or Gaokao, examination — the 
most important exam taken by Chinese young 
people (when they are 17) — so students, parents 
and teachers have never taken it very seriously. 

That is now changing. A pilot programme run 
in the Beijing region and led by the Chen Jing Lun 
school will see practical-science projects contrib- 
ute. The BRT will be the lead project offered to 
the students, starting this spring term. The British 
Council — which promotes international educa- 
tional opportunities and cultural relations — has 
helped us to translate our website into Chinese. 

Assuming that the pilot succeeds, there are 
already plans to expand it. We are talking to the 
Chinese Academy of Sciences about how it could 
build its own robotic telescopes. The Beijing Planetarium, Tsinghua 
University and the South China University of Technology in Guangzhou 
are already working on ways to give all Chinese students access to them. 

There is interest elsewhere, too. Ireland ran a very successful pilot 
programme last year through University College Dublin and is now 
looking at how to roll it out across all secondary schools in Ireland. A 
pilot in Opole, Poland, organized through the University of Warsaw's 
physics department has the same objectives. All these international 
developments are driven by an aspiration to build technology-driven 
knowledge economies with high-paying employment. 

The skills of innovation and creativity developed with practical sci- 
ence are the bedrock of a knowledge economy, and the Chinese and oth- 
ers using UK technology to boost their competitiveness must be looking 
at England’s plans to drop practical science at A level as rather strange. 


John Baruch is a senior lecturer at the University of Bradford and 
visiting professor at South China University of Technology in Guangzhou. 
e-mail: john@telescope.org 
Catalyst eases
fuel production 


A catalyst could improve the 
manufacture of methanol, a 
promising fuel for renewable 
energy, from carbon dioxide. 
Current methods require 
high pressures or generate 
carbon monoxide, an 
undesirable by-product. 

Jens Norskov at Stanford 
University in California and 
his colleagues modelled the 
chemical reduction of CO₂ to
methanol at ambient pressure 
and identified nickel-gallium- 
based compounds as promising 
catalysts. 

The researchers synthesized
and tested a series of these
catalysts, and found that Ni₅Ga₃
produced the same or larger
amounts of methanol compared
with conventional catalysts,
while also generating less
CO, all at ambient pressure.

This catalyst could be used 
to make methanol as a fuel
in, for instance, fuel cells, the 
authors say. 

Nature Chem. http://doi.org/rss 
(2014) 


How the fish got 
its fins 


The adipose fin, which 
sits between the dorsal 
fin and the tail on many 
fishes, might have evolved 
separately in different fish 
lineages rather than once 
from a single ancestor. 
This suggests that the fin 
(pictured with arrow) has 
an adaptive purpose and 
can evolve into various 
forms, contrary to previous 
thinking. 

Thomas Stewart
at the University of
Chicago, Illinois, and his
colleagues reconstructed
the evolutionary
relationships of 232 fishes,
looking at the presence
and absence of adipose
fins. They also studied the
skeletons of 620 fish species
from 55 families. The
team concludes that these
fins have a wide variety of
skeletal structures and have
repeatedly evolved some of
the same features, such as
fin rays — rods of bone or
cartilage that support the
fin membrane.

Although the purpose
of these fins is not clear,
they could be a powerful
tool for studying vertebrate
limb evolution, the authors
suggest.
Proc. R. Soc. B 281, 20133120
(2014)


Fast imaging captures falling droplets


Researchers have obtained images of tiny oil
droplets (pictured) forming in flight at high
speeds. Such data on droplet formation could
improve inkjet printing techniques.

Studying how droplets behave in inkjet
printing is difficult because they move so
rapidly. Detlef Lohse at the University of
Twente in the Netherlands and his colleagues
used 8-nanosecond-long flashes of a laser to
light up picolitre-sized silicone oil droplets,
and recorded these with a microscope and
high-speed camera. By comparing two images
of the same droplets taken 600 nanoseconds
apart, the authors calculated the internal
flow rate of the droplets as they formed, and
found good agreement with mathematical
simulations.

Phys. Rev. Applied 1, 014004 (2014)


ARCHAEOLOGY 


Cats tamed early 
in Egypt 


Ancient Egyptians might
have domesticated
wild cats nearly 2,000
years earlier than
previously thought.

Egyptian artwork from
4,000 years ago depicts
domesticated cats alongside
humans. But in 2008, Wim
Van Neer at the Royal
Belgian Institute of Natural
Sciences in Brussels and
his colleagues discovered
six cat skeletons buried
in a cemetery for elite
Egyptians that dates to the
fourth millennium BC. The
teeth and bones resemble
those of modern domestic
felines. The cats — two
pairs of kittens, and an older
female and male — seem to
have been born outside the
breeding season of wild cats,
suggesting that humans had
a role in rearing them, the
researchers say.

J. Arch. Sci. http://doi.org/rsg 
(2014) 
CONSERVATION BIOLOGY


Extinction looms 
for many mammals 


Nearly one-quarter of the 
world’s carnivores and hoofed 
mammals have moved closer 
to extinction since the 1970s. 
Moreno Di Marco at 
Sapienza University of Rome 
and his colleagues looked at 
the conservation statuses of 
about 500 species of carnivores 
and ungulates over the past 
40 years. The researchers 
found that for every species 
that saw improvements in 
status, eight deteriorated. 
Large animals are also sliding 
towards extinction faster than 
their smaller counterparts, 
and the sharpest declines in 
conservation status were seen 
in southeast Asian species. 
The authors attribute 
these shifts to factors such as 
changes in international trade 
regulations, hunting, habitat 
loss and geopolitical events 
such as the collapse of the 
Soviet Union, which resulted 
in the loss of protected areas. 
Conserv. Biol. http://doi.org/rrs 
(2014) 


ECOLOGY 


Warmer climate 
disturbs food web 


A study of seabird feathers 
has revealed how climate 
change is shifting the food 
web in the Indian Ocean. 

Alexander Bond at the 
University of Saskatchewan 
in Saskatoon, Canada, 
and Jennifer Lavers at the 
University of Tasmania in 
Hobart, Australia, inferred 
the diets of flesh-footed 
shearwaters (Puffinus 
carneipes) by looking at 
ratios of carbon and nitrogen 
isotopes in the birds’ feathers 
that were collected between 
1936 and 2011. 

The duo found that
levels of heavy isotopes —
which are present at higher
concentrations in species
further up the food chain
— fell in shearwater feathers
over the years, hinting
that the birds are eating
animals that are lower on
the food chain. This could
be due to a lack of large
fish caused by fishing.
Furthermore, the length of
the shearwaters’ food chain
could be shortening because
of reduced nutrient flow to
the Indian Ocean, owing
to a warming climate that
is weakening the Leeuwin
Current near the western
coast of Australia.

Glob. Change Biol. http://doi.org/rrp (2014)


High cholesterol in 
prostate tumours 


Prostate cancer could one
day be treated by altering
the cancer cells’ abnormal
cholesterol metabolism.

Ji-Xin Cheng at Purdue 
University in West 
Lafayette, Indiana, and his 
colleagues used Raman 
spectromicroscopy to analyse 
lipids inside single cells in 
tissue samples from people 
with prostate cancer. The 
team found that a cholesterol 
derivative, cholesteryl ester, 
accumulates inside the most 
aggressive of the cancer cells, 
but not in normal prostate 
cells. This build-up occurs 
because of the loss of PTEN, a 
tumour suppressor linked to 
many cancers. 

Treating tumour-bearing 
mice with small molecules 
that block the accumulation 
of cholesteryl ester shrank 
the tumours and slowed their 
growth. 

Cell Metabol. 19, 393-406 (2014) 


Hot air guides 
laser beams 


A channel of hot air could 
enable high-power laser 
beams to travel through 
the atmosphere over long 
distances — which might be 
useful for applications such as 
communications. 
High-power beams cannot
be precisely focused over
many kilometres, because
the surrounding air absorbs
and distorts the radiation.
To solve this problem,
Howard Milchberg and his
colleagues at the University
of Maryland in College
Park created a conduit in
air for the laser. They fired
a square-shaped array of
four intense, low-power
light bursts. The dissipating
shots left a channel of hot,
dense air, through which
the researchers then fired a
powerful, focused laser beam.
The team showed that the
air channel guided the beam
through 70 centimetres of
air, and calculated that this
waveguide could work over
longer distances.

Phys. Rev. X 4, 011027 (2014)


Why warm caresses feel so good


Nerve fibres in human skin that are
sensitive to gentle touch are specially
tuned to respond to slow, skin-temperature strokes.

Rochelle Ackerley at the University of Gothenburg,
Sweden, and her colleagues used a robotic probe to stroke the
forearms of volunteers at different speeds and temperatures.
In one experiment, the researchers recorded the electrical
responses of the nerves, called C-tactile fibres, in the skin of
18 participants. In another, they assessed how pleasurable
30 different participants considered each stroke.

The C-tactile fibres fired more frequently, and participants
reported more pleasure, when strokes were applied slowly and
the probe was close to typical skin temperature. The findings
suggest that the fibres have a role in evolutionarily important
social interactions that rely on touch, such as in romantic
relationships or when nurturing a baby.

J. Neuro. 34, 2879-2883 (2014)


ENGINEERING
Shake to make 
power 


A device that generates 
electricity through contact 
and friction might one day 
be used to harvest the energy 
from human motion to 
charge portable electronics. 
Zhong Lin Wang and his 
colleagues at the Georgia 
Institute of Technology 
in Atlanta designed a
compact, light-weight
generator (pictured),
consisting of a copper-plated
disk that spins
and rubs against a static
base containing a layer of 
electrodes and a conducting 
surface. The device can 
harvest mechanical energy 
from gentle wind, tap-water 
flow and normal body 
movements. 

The technology could be 
developed for large-scale 
power generation, the 
authors say. 

Nature Commun. 5, 3426 (2014) 
Infant HIV cure


A second HIV-infected
child seems to have been
successfully treated, according
to the doctor who last year
reported the first child to
be cured of the infection
(see Nature 504, 357-365;
2013). Deborah Persaud,
who studies paediatric
infectious diseases at Johns
Hopkins Children’s Center
in Baltimore, Maryland,
presented the results on
5 March at the annual
Conference on Retroviruses
and Opportunistic Infections
in Boston, Massachusetts. She
said that a girl born last April
to an HIV-infected mother
was treated with antiretroviral
drugs within four hours of
birth, and that tests suggest
she has been cured. The child
continues to receive anti-HIV
medication.


Retraction call


Two prominent research
papers that describe a method
for reprogramming mature
cells into an embryonic
state should be retracted, a
co-author of the papers said
in media reports on 10 March.
The studies, published in
Nature (H. Obokata et al.
Nature 505, 641-647, 676-680;
2014), apparently contain
images duplicated from the
doctoral dissertation of the
lead author, Haruko Obokata
of the RIKEN Center for
Developmental Biology in
Kobe, Japan. “I have lost
faith in the paper,” said
Obokata’s co-author, Teruhiko
Wakayama, a mouse-cloning
expert at the University of
Yamanashi. See go.nature.com/5gpzog
for more.


The total amount
(US$253 million) that
Italian regulators fined the
Switzerland-based drug
companies Novartis and
Roche for colluding to block
the use of an eye treatment
in order to promote a more-
expensive alternative drug
that they jointly market.


Asteroid caught in the act of falling apart


For the first time, the Hubble Space Telescope
has captured an asteroid in the process of
breaking into pieces. A series of close-up
images taken over several months (pictured)
revealed that a fuzzy object first spotted in
September 2013 is actually a set of ten rocky
fragments that are slowly drifting apart. A
team led by astronomer David Jewitt at the
University of California, Los Angeles, describes
the findings this month in Astrophysical
Journal Letters (D. Jewitt et al. Astrophys. J. 784,
L8; 2014).

[Image panels: 29 October 2013; 15 November; 13 December; 14 January 2014]


Live longer


Genomics pioneer Craig
Venter hopes to discover
how to keep ageing adults
fit and healthy for longer
with the launch of a new
company, Human Longevity,
announced on 4 March. Based
in San Diego, California, the
business is a joint venture
with Robert Hariri, chief
executive of the stem-cell
company Celgene Cellular
Therapeutics in Summit,
New Jersey, and the X Prize
Foundation founder Peter
Diamandis. It will sequence
the genomes of cancer
patients and their tumours,
and the genomes of 40,000
people per year from all
age groups, with the aim of
increasing this to 100,000
per year, to build the largest
database yet of human genetic
information.


144 | NATURE | VOL 507 | 13 MARCH 2014


Budget cuts bite
The US National Institutes
of Health awarded 750 fewer
new research grants in 2013
compared with 2012, an
8.3% drop, as a result of the
sweeping budget cuts known
as sequestration that hit
government agencies last
year. The finding is part of a
report published on 6 March
by the US Government
Accountability Office, which
notes that it will be years
before the full effects of the
cuts are known. The 2013
sequestration also hit the US
National Science Foundation,
which awarded 690 fewer
grants. See pages 139 and 147
for more.


China congress 
China’s Premier Li Keqiang 
declared a war on pollution 
in his opening speech to the 
annual National People’s 
Congress on 5 March. At 
the meeting, attended by the 
country’s leading lawmakers 
and politicians, Li said that 
green measures will target
outdated energy production
and industrial processes.

He also gave the nation’s 
researchers a boost, pledging 
cash for basic science. See 
page 148 for more. 


PEOPLE
ACS head retires 


The world’s largest scientific 
society, the American 
Chemical Society (ACS), 

will lose its chief, Madeleine 
Jacobs, when she retires at the 
end of this year. Jacobs has 
spent more than 24 years with 
the society, including 11 years 
as its chief executive. She came 
to the ACS after spending 

21 years in science journalism 
and public affairs, and during 
her tenure oversaw a boom 

in fund-raising activity that 
netted the society more than 
US$500 million per year. 


Smithsonian head 


David Skorton, the president 
of Cornell University in Ithaca, 
New York, will be the new 
head of the US Smithsonian 
Institution from July 2015. 
Skorton, a cardiologist who 
was formerly president of 

the University of Iowa in 

Iowa City, will replace Wayne 
Clough, the Smithsonian's 
Board of Regents announced 
on 10 March. The institution, 
a collection of museums 

and research complexes 
headquartered in Washington 
DC, has an annual budget of 
US$1.3 billion. 


TREND WATCH 


The European Union (EU) is 
getting more innovative, but its 
member states remain divided 
into leaders and laggards, suggests 
the European Commission's 
Innovation Union Scoreboard 2014 
(see chart). The worst countries 
barely improved, as measured 

by a set of metrics covering 
research systems, papers, patents, 
entrepreneurship and innovative 
firms. At the sub-national level, 
performance got worse in one- 
fifth of EU regions. Europe closed 
the gap between itself and the 
United States and Japan, however. 


New NOAA chief 


The US Senate approved 
Kathryn Sullivan to lead 

the National Oceanic and 
Atmospheric Administration 
(NOAA) on 6 March. Sullivan 
(pictured) is a former NASA 
astronaut and the first US 
woman to walk in space. She
replaces marine ecologist Jane 
Lubchenco, who resigned 

in February 2013. Sullivan 
returned to the agency in 2011 
as its deputy administrator 
after serving as its chief 
scientist in the 1990s. 


Forensic fraud 
Disgraced forensic chemist 
Annie Dookhan, who worked 
at the Hinton State Laboratory 
Institute drug lab in Jamaica 
Plain, Massachusetts, until 
2012, acted alone when she 
falsified data and tampered 
with drug samples. So 
concludes Glenn Cunha, 
Massachusetts’ Inspector 
General, in a report dated 4 
March. However, the report 


adds that poor lab management 
and training, and weak security 
enabled the crimes. Dookhan 
admitted to the fraud in 2012 
and is currently serving a 
prison sentence. See go.nature. 
com/oohole for more. 


El Niño cometh?


The eastern equatorial Pacific 
Ocean might shift into a warm 
phase, known as El Niño,

in the next few months, the 
US National Oceanic and 
Atmospheric Administration 
(NOAA) said on 6 March. 
The phenomenon disrupts 
weather patterns around the 
globe and could become more 
frequent as a result of global 
warming. NOAA says that 
there is a 50% chance of an 

El Niño developing during
the Northern Hemisphere’s 
summer or autumn, but that 
the accuracy of forecasts will 
improve over the next two 
months. 


Coal-mine fine 

One of the largest coal 
companies in the United States 
has agreed to spend around 
US$200 million to reduce 
water pollution and monitor 
the environment at mines in 
the Appalachian Mountain 
region, the US Environmental 
Protection Agency announced 
on 5 March. Alpha Natural
Resources, based in Bristol,
Virginia, and its subsidiaries
will also pay fines totalling
$27.5 million for thousands of
violations of the Clean Water
Act. The settlement covers
79 mines and 25 processing
plants in 5 states, including
West Virginia.


SPLIT STREAMS IN EUROPE’S INNOVATION

There are stark differences in innovation performance in EU member
states (clustered into four groups by the European Commission).
Innovation leaders: Sweden, Denmark, Germany, Finland.
Innovation followers: Austria, Belgium, Cyprus, Estonia, France, Ireland, Luxembourg, the Netherlands, Slovenia, UK.
Moderate innovators: Croatia, Czech Republic, Greece, Hungary, Italy, Lithuania, Malta, Poland, Portugal, Slovakia, Spain.
Modest innovators: Bulgaria, Latvia, Romania.


SEVEN DAYS | THIS WEEK


16-18 MARCH
A symposium entitled
‘The Evolution of
Modern Humans —
From Bones to Genomes’
takes place in Sitges,
Spain. Hosted by Cell
Press, the meeting will
discuss multidisciplinary
approaches to studying
the evolution of
Homo sapiens.
go.nature.com/63txud


24-25 MARCH
Representatives of more
than 50 nations meet
at the third Nuclear
Security Summit in The
Hague, the Netherlands.
They will discuss ways
to reduce the amount
of nuclear material in
the world and tackle
smuggling.
go.nature.com/pdiaam


Radiation contained 


No radioactive contamination 
or air-quality problems were 
detected by the first probes 
lowered into the Waste 
Isolation Pilot Plant near 
Carlsbad, New Mexico, the 
US Department of Energy 
announced on 9 March. One 
section of the nuclear-waste 
facility suffered a leak last 
month (see Nature http://doi.org/rtw;
2014), but energy-department
officials say

that the plant's air-filtration 
system stopped the radiation 
reaching other parts of the 
facility. The results must be 
confirmed before employees 
can re-enter the plant and 
begin investigating the cause 
of the leak. 




© 2014 Macmillan Publishers Limited. All rights reserved 


JONATHAN ERNST/REUTERS/CORBIS 


NEWS IN FOCUS


POLICY China seeks to boost neglected basic-research budget p.148
NUTRITION Scientists foresee industry backlash over sugar guidelines p.150
ENERGY Flagging US biofuel industry gambles on plant waste p.152
ECOLOGY Top predators may not control ecosystem structure p.158


President Barack Obama’s 2015 budget has disappointed. 


Obama’s budget 
request falls flat 


Hopes dim for a science-funding increase in 2015. 


BY LAUREN MORELLO, JESSICA MORRISON, 
SARA REARDON, JEFF TOLLEFSON AND 
ALEXANDRA WITZE 


What a difference a year makes. In
April 2013, US President Barack 
Obama unveiled a budget proposal 


that set out plans to map the human brain and 
to capture an asteroid that could be towed 
near to the Moon for astronauts to study. But 
that sense of ambition is absent from Obama's 
plan for fiscal year 2015, which was released 
on 4 March, frustrating scientists and policy 
analysts who hoped for a show of support 
from the White House after years of pressure 
to reduce government spending. 


The US$3.9-trillion proposal would keep 
budgets flat or nearly so at major science agen- 
cies, including the National Institutes of Health 
(NIH), the National Science Foundation (NSF) 
and NASA. Overall, it includes just $65.9 bil- 
lion for non-defence research and develop- 
ment, a paltry 0.7% above the current level. 

The proposal seeks an extra $5.3-billion boost 
for science from a new initiative that is separate 
from normal agency budgets, to be paid for in 
large part by new taxes on the wealthy. But few 
observers expect Congress to accept that plan, 
dimming hopes for a significant boost for sci- 
ence funding in 2015. “We have to convince the 
legislators that they have to make some tough 
decisions and give more money to science,” 


says Samuel Rankin, director of the American 
Mathematical Society's Washington DC office. 
“That's easier said than done.”

At the NIH, Obama is seeking $30.2 billion 
in 2015, compared with this year’s $30 billion 
(see ‘Budget highlights’) — a rise of less than 
1%. That is not enough to outpace the expected 
rise in research costs in 2015, which according 
to the agency’s own Biomedical Research and 
Development Price Index will be 2.2%, higher 
than the US inflation rate. One of the few 
bright spots in the NIH plan is continued sup- 
port for the Brain Research through Advanc- 
ing Innovative Neurotechnologies Initiative. 
It would receive $100 million in 2015, up from 
$40 million this year. 
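
The gap between a nominal rise and a real-terms cut can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the NIH figures from 'Budget highlights' (in US$ millions) and the 2.2% rise projected by the Biomedical Research and Development Price Index, and the function name `real_change` is a hypothetical helper, not anything the agency uses.

```python
def real_change(previous, requested, cost_inflation):
    """Return (nominal, real) fractional change between two budgets.

    cost_inflation is the expected fractional rise in research costs.
    """
    nominal = requested / previous - 1.0
    # Deflate the requested budget by the expected rise in costs.
    real = (1.0 + nominal) / (1.0 + cost_inflation) - 1.0
    return nominal, real


# NIH: 2014 actual and 2015 request, US$ millions; 2.2% projected cost rise.
nominal, real = real_change(30_003, 30_203, 0.022)
print(f"nominal change: {nominal:+.2%}")
print(f"real change:    {real:+.2%}")
```

On these assumptions the request is up by about 0.7% in cash terms but down by roughly 1.5% once research costs are taken into account.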

“Scientists are constitutionally not comfort- 
able with the status quo, and that’s what this is 
— not only the money but the idea,” says How- 
ard Garrison, deputy executive director for 
policy at the Federation of American Societies 
for Experimental Biology in Washington DC. 
Stefano Bertuzzi, executive director of the 
American Society for Cell Biology in Bethesda,
Maryland, is blunter. “A 1% bump is not going
to cut it,” he says.

The NSF would see its budget creep up to 
$7.3 billion, roughly 1% above the current 
level. Of the agency's seven research directo- 
rates, only Social, Behavioural and Economic 
Sciences (SBE) would receive a noticeable 
boost, rising by 6% to $272 million. Most of 
that is earmarked for the National Center for 
Science and Engineering Statistics (NCSES), 
which tracks trends in scientific research, 
education and workforce development. Rick 
Wilson, a political scientist at Rice University 
in Houston, Texas, would have liked to see 
more SBE money made available to individual 
researchers, but says that the NCSES data are 
“really useful” for social scientists. 

At NASA, the proposed $17.5-billion total 
budget pushes forward the administration's 
plans to capture and study an asteroid, despite 
lawmakers’ scepticism about the mission's value. 
Asteroid experts at NASA’s Jet Propulsion Labo- 
ratory in Pasadena, California, have been hunt- 
ing for candidate targets, while astronauts at the 
Johnson Space Center in Houston, Texas, are 
designing equipment for a close-up encoun- 
ter with a space rock. NASA has asked for 
$133 million — an increase of $55 million over 
this year — for the programme, which NASA 
chief Charles Bolden has described as a way 
to test new exploration capabilities aimed
at eventually sending astronauts to Mars.
If Congress approves the budget proposal, the
agency will select the basic outlines of the
mission plan early next year.


BUDGET HIGHLIGHTS

How science agencies fared in the budget (US$ millions).

Agency                                            2013 actual*   2014 actual   2015 request
National Institutes of Health                     29,001         30,003        30,203
Centers for Disease Control and Prevention        5,503          5,882         5,474
Food and Drug Administration                      2,386          2,640†        2,584
National Science Foundation                       7,105          7,142         7,255
NASA (science)                                    4,782          5,151         4,972
Department of Energy Office of Science            4,681          5,066         5,111
Environmental Protection Agency                   [illegible]    8,200         7,890
National Oceanic and Atmospheric Administration   4,906          5,322         5,497
US Geological Survey                              1,012          1,032         1,073

*2013 figures include the roughly 5% across-the-board cut of the budget sequester.
†Includes one-time transfer of $79 million in user fees.

Obama's budget request also includes 
$15 million for a spacecraft to Europa, an 
icy moon of Jupiter. The move is an apparent 
nod to Congress, which last year inserted $80 
million into the NASA budget explicitly for 
Europa, despite the administration’s objec- 
tions. At times, NASA has sounded less than 
enthusiastic about this moon; Beth Robinson, 
the agency’s chief financial officer, told report- 
ers last week that Europa was a challenging 


environment and that she did not foresee a 
mission to the moon launching before the 
mid-2020s. 

Overall, NASA’s science budget would drop 
nearly 3.5% under the president's proposal, to 
$4.97 billion. The biggest casualty would be 
the Stratospheric Observatory for Infrared 
Astronomy, a joint US-German project involv- 
ing a 2.5-metre airborne telescope. NASA says 
that if other partners cannot supply most of 
the plane's operating budget, it will ground the 
observatory from 1 October (see Nature
http://doi.org/rtk; 2014).

Also facing a cut is high-energy physics 


research at the Department of Energy (DOE), 
which would fall by 6.8% to $744 million. 
That has sparked concerns that the US physics 
community has been too slow to unite behind 
a viable research agenda. With unfortunate 
timing, a DOE advisory panel charged with 
producing such an agenda is scheduled to 
release its results two months from now. Under 
consideration are proposals for a new domestic 
neutrino experiment and US participation in a 
linear collider, likely to be destined for Japan, 
that would build on advances at the Large 
Hadron Collider in Geneva, Switzerland. 

“The White House is certainly sending a 
signal to the high-energy physics commu- 
nity that it needs to get its act together,” says 
Michael Lubell, director of public affairs for 
the American Physical Society in College Park, 
Maryland. 

To Andrew Lankford, a physicist at the 
University of California, Irvine, who leads the 
DOE's High Energy Physics Advisory Panel, 
the move is not surprising, given Obama’s 
emphasis on climate and clean-energy research 
and development at the department — these 
saw a significant boost in the White House 
budget proposal. Lankford says that the pro- 
posal would make it “a challenge to main- 
tain the vitality of our research community” 
— but he is confident that his panel’s report 
will be completed in time to influence budget 
negotiations in Congress. SEE EDITORIAL P.139


SOURCE: WHITE HOUSE OFFICE OF MANAGEMENT AND BUDGET 


China goes back to basics 
on research funding 


Core science gets budget boost in a bid to change research culture and increase innovation.


BY JANE QIU 


Last week, Chinese science saw some big
wins as Premier Li Keqiang delivered

his first budget since taking office a year 
ago. Yet observers have warned that to translate 
that support into innovation, the country must 
invest more in basic research and move away 
from its desire for quick successes. 

China’s total expenditure on research and 
development (R&D) has increased by 23% a 
year on average over the past decade. But with 
uncertainties arising from a new government 
and the effects of the economic slowdown, 
scientists had feared cutbacks this year. 

CASH DRAW

In most sectors, China spends more on developing existing technologies than on basic or applied science.
The difference is more pronounced than in some other countries.


At the opening session of the annual National
People’s Congress in Beijing, however, Li
reassured the research community by stressing
the importance of scientific innovation to
economic growth — and by pledging hard cash. 

The central government's expenditure on 
science and technology this year was set at 
US$43.6 billion (267.4 billion yuan renminbi), 
an 8.9% rise on last year, which slightly trails the 
overall projected budget increase of 9.3%. The 
biggest winners are 16 ‘megaprojects’ with an 
emphasis on engineering and applied research 
in areas such as transgenic crops, nuclear power 
plants and lunar exploration, which together 
will receive a whopping $8.1 billion. 

China's basic-research spending has his- 
torically been extremely low — about 4.8% 
in 2012 and 2013, compared with 10-25% in 
developed nations (see ‘Cash draw’). But this 
year, the appropriation for basic research will 
increase by 12.5% to $6.6 billion — of which 
the National Natural Science Foundation of 
China is slated to get $3.1 billion, says its presi- 
dent, Yang Wei. The major areas that the foun- 
dation will fund include studies of biodiversity, 
air pollution, supercomputers, neurodegenera- 
tive diseases and scientific equipment. 

Two of the 16 megaprojects have a substan- 
tial basic-research component: these are in the 
areas of drug discovery and major infectious 
diseases, including HIV/AIDS and influenza. 
And with a combined budget of $488 million, 
the two initiatives “will continue to strengthen 
the capacity for drug screening, rapid detection
of pathogens and vaccine development”,
says Liu Qian, deputy director of the National 
Health and Family Planning Commission, 
who oversees the projects. 

The Ministry of Science and Technology will 
spend about 8% of its total budget of $8.1 bil- 
lion on basic research — including $211 million 
on six major science programmes in areas such 
as nanotechnology, quantum physics, stem 
cells and protein science — and $1.1 billion on 
developing key technologies. 

The Chinese Academy of Sciences (CAS), 
which relies mostly on extramural grants to 
fuel its research, will receive $423 million for 20 
‘strategic priority projects’ in areas ranging from 
neuroscience to studies of the Tibetan Plateau. 

Although such strong support is welcomed,
observers such as Richard Suttmeier, a policy
researcher at the University of Oregon in
Eugene who advises China’s science ministry,
fear that more money may not bring more innovation
without a sea change in how it is spent,
and without a shift in China's research culture
and institutions. This is because of serious
concerns about the country’s quality of research.


Premier Li Keqiang pledged support for basic science at last week’s National People’s Congress in Beijing.

“While China’s output of publications and 
patents is impressive, there are very few genuine 
innovations,” says CAS president Bai Chunli. 
A key reason, says Yang, is the modest govern- 
ment support for basic research. Moreover, he 
adds, “a large chunk of China’s R&D expendi- 
ture comes from industry”. In 2012, industry 
contributed 76% of all R&D funding but spent 
almost nothing on basic science and only 3% 
on applied research, he says (see ‘Cash draw’). 
Consequently, 84% of China's total R&D spend- 
ing goes on product development, such as the 
commercialization of technologies, compared 
with 35-65% in developed economies. 

Even research institutes spend only 13% of 
their funding on basic science and less than 
one-third on applied research, and are under 
increasing pressure to engage in development 
projects that focus on real-world problems. 
Without a firm footing in solid research, such 
projects “have limited value in developing an 
innovative economy”, says Su Jun, a policy
researcher at Tsinghua University in Beijing. 

Chinese scientists are also concerned by the 
serious misuse of research grants. A report 
released by China’s National Audit Office last 
October shows that the problem is widespread 


— and that up to half of all research funding 
has been misused. In a crackdown, one vice- 
president of Zhejiang University was arrested 
last December on “suspicion of occupational 
crime”, according to the Xinhua news agency,
and 50 officials at the Guangdong Provincial 
Department of Science and Technology are 
under investigation for embezzling R&D funds. 
When research grants are used as intended, 
Chinese science “suffers from excessive bureau- 
cratic interference and a culture of jigong jinli 
— seeking quick success and short-term gain”,
says Muming Poo, director of the CAS Institute 
of Neuroscience in Shanghai. Officials often 
demand the demonstration of productivity 
on an almost yearly basis, and grants can be 
slashed by 50% if researchers fail in this task. 
“The Chinese government is aware of the 
problems and institutional reforms are firmly 
on the agenda of the congress to boost R&D 
efficiency,” says Bai. Topics under discussion at
the congress included increasing the proportion 
of basic research in total R&D to 10% by 2020; 
instigating policies to encourage investment in 
basic and applied research; reforming funding 
and evaluation systems; raising the budget for 
research overheads; and giving scientists more 
freedom. Such changes “will be crucial for Chinese
science to reach the next level”, says Bai.


The WHO recommends that adults have less sugar per day than is found in one glass of many soft drinks.


NUTRITION 


Storm brewing over 
WHO sugar proposal 


Industry backlash expected over suggested cut in intake. 


BY BRIAN OWENS 


Scientists are gearing up for a battle with
the food industry after the World Health 


Organization (WHO) moved to halve its 
recommendation on sugar intake. 

Nutrition researchers fear a backlash similar 
to that seen in 2003, when the WHO released 
its current guidelines stating that no more than 
10% of an adult's daily calories should come 
from ‘free’ sugars. That covers those added to 
food, as well as natural sugars in honey, syrups 
and fruit juice. In 2003, the US Sugar Asso- 
ciation, a powerful food-industry lobby group 
based in Washington DC, pressed the US gov- 
ernment to withdraw funding for the WHO 
if the organization did not modify its recom- 
mendations. The WHO did not back down, 
and has now mooted cutting the level to 5%. 

“These are reasonable limits,” says Walter
Willett, head of nutrition at the Harvard School 
of Public Health in Boston, Massachusetts. “Five 
per cent of calories is just a bit less than in a typi- 
cal serving of soda, and we have good evidence 
of increased risk of diabetes with that intake, 
which of course increases with greater intake.” 

But Marion Nestle, a nutrition researcher 
at New York University, predicts that grocery 
manufacturers are not going to take the pro- 
posal lying down. “If people follow this advice, 
that would be very bad for business,” she says. 

The WHO made its recommendations in 
draft guidelines that were released for public 
consultation on 5 March. In halving the 10% 
figure, it cited the need to fight obesity — world- 
wide incidence reached 11% in 2008 — and 


to prevent tooth decay. Five per cent of daily 
calories is equivalent to about 25 grams, or 
6 teaspoons, of sugar. Many people around the 
world consume more than that — young adults 
in the United States, for example, get more than 
14% of their calories from free sugars, according 
to the US Centers for Disease Control and Pre- 
vention in Atlanta, Georgia (see ‘Sugar high’). 
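
The conversion from a share of daily calories to grams and teaspoons can be reproduced as a rough sketch. Three inputs below are assumptions that do not come from the draft guidelines as quoted here: a reference diet of 2,000 kilocalories per day, about 4 kilocalories per gram of sugar, and about 4.2 grams of sugar per teaspoon. The function name `free_sugar_limit` is hypothetical.

```python
KCAL_PER_GRAM_SUGAR = 4.0  # assumption: usual energy value for carbohydrate
GRAMS_PER_TEASPOON = 4.2   # assumption: approximate mass of a teaspoon of sugar


def free_sugar_limit(daily_kcal, share):
    """Return (grams, teaspoons) of sugar supplying `share` of daily calories."""
    grams = daily_kcal * share / KCAL_PER_GRAM_SUGAR
    return grams, grams / GRAMS_PER_TEASPOON


# Assumed 2,000 kcal reference diet and the proposed 5% limit.
grams, teaspoons = free_sugar_limit(2000, 0.05)
print(f"{grams:.0f} g, about {teaspoons:.0f} teaspoons")
```

Under those assumptions the 5% limit works out at 25 grams, or roughly 6 teaspoons, and the same function gives about 50 grams for the existing 10% limit.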

The guidelines are based on a careful analysis
of more than 120 scientific studies, summarized
in two meta-analyses commissioned
by the WHO (L. Te Morenga et al. Br. Med. J. 
346, e7492; 2013; P. J. Moynihan and S. A. M. 
Kelly J. Dent. Res. 93, 8-18; 2014). 

Jim Mann, a nutrition researcher at the University
of Otago in New Zealand, who worked
on one of the meta-analyses and helped to
develop the guidelines, says that the science
supporting a drop in sugar consumption has
become more conclusive since 2003. But the
biggest difference is in the process the WHO
uses to produce its recommendations.


SUGAR HIGH

The latest figures show that average consumption
of sugar by males and females in the United
States exceeds that recommended by the WHO.

For the first time in the production of nutri- 
tion guidelines, the agency adopted the Grading 
of Recommendations Assessment, Develop- 
ment and Evaluation (GRADE) system, a more 
formal, standardized approach to developing 
guidelines compared with a literature review. It 
requires a clear statement of the research ques- 
tion, uses the gold-standard methodology for 
literature review and meta-analysis developed 
by the Cochrane Collaboration — a non-profit 
group headquartered in Oxford, UK, dedicated 
to the systematic analysis of medical research — 
and weighs up biases and confounding factors 
before an expert committee develops recom- 
mendations. This painstaking process “doesn’t 
give much leeway for opinions”, says Mann.

His analysis showed a strong confirmation 
of the benefits of the 10% limit, especially for 
preventing tooth decay, with “good clues” that 
it would be worth going lower, although the 
evidence for that is weaker. 

This is likely to form the focus of the sugar 
lobby’s attacks, researchers say. Most industry 
groups have refused to comment until they have 
prepared submissions to the consultation, but a 
few have already criticized the lower limit. The 
US Sugar Association, for example, released 
a statement pointing out that the US Institute 
of Medicine and the European Food Safety 
Authority have said in the past (in 2005 and 
2010, respectively) that there was no conclusive 
evidence to justify such a limit on free sugars. 

Industry submissions to the consultation 
are likely to be forceful. When the WHO rec- 
ommended the 10% limit, it faced a ferocious 
attack on the credibility of its science from 
several camps — including the administra- 
tion of then US President George W. Bush. The 
administration said that the WHO report did 
not meet US data-quality standards, was not 
properly peer-reviewed, and failed to separate 
scientific and policy recommendations. 

Nestle thinks that if the WHO is willing to
face that kind of pressure again, it must have 
confidence not only in its science, but also in 
the political climate. “There's so much evi- 
dence now that says that people would be 
healthier if they ate less sugar, it may be that 
things have changed,” she says. 

This time around, the WHO is taking steps to 
counter excessive lobbying. Anyone who wishes 
to submit a comment on the draft guidelines 
must first complete a declaration-of-interest 
form. And the organization says that it will 
stand firm against any push-back from the food 
industry. “If pressure comes to the organization, 
then we're very well equipped to resist that type 
of pressure,” said Francesco Branca, director of
the WHO’s Department for Nutrition for Health
and Development, at a press conference.


Global seismic network
takes to the seas 


Two systems could plug the ocean-sized gap in earthquake detection.


BY NICOLA JONES 


More than 25 years after its inception,
there is hope that the Global Seismographic
Network (GSN) will finally
live up to its name. The network’s 150 or so seis- 
mic stations listen for signs of earthquakes and 
nuclear tests, and help geophysicists to image 
Earth’s interior, but their scope is limited: they 
are all located on land. 

The GSN’s vast marine blind spot could 
soon be eliminated, however, thanks to new, 
relatively inexpensive equipment that will be 
field-tested between April and June. For John 
Orcutt, a geophysicist at the Scripps Institution 
of Oceanography in La Jolla, California, the 
prospect of truly global measurements is tan- 
talizing. “Working out how the interior of the 
planet works is really hard when all your sensors 
are only on 30% of the planet's surface,” he says. 

The GSN, which is run jointly by the Incor- 
porated Research Institutions for Seismology in 
Washington DC and the US Geological Survey 
(USGS), was originally intended to blanket the 
globe with sensors. But installing permanent 
seismic monitors on the sea floor proved to be 
too expensive. Thousands of metres of cable 
are needed to connect the monitors to surface 
buoys that transmit data in real time, and the 
bulky equipment must be deployed from costly 
research ships. “It’s the ship time that kills you,”
says Jonathan Berger, a geophysicist at Scripps 
who has been involved with the GSN since its 
inception. Placing and maintaining 2,250 sea- 
floor stations, spaced roughly 400 kilometres 
apart, would cost between US$700 million and 
$1 billion over five years, says Guust Nolet, a 
geophysicist at the University of Nice Sophia 
Antipolis in France. 

Faced with such a steep price tag, researchers 
have made do with half measures. At any given 
time, there are a few hundred seismic stations 
temporarily deployed on the ocean floor, storing 
data until they can be picked up by ship — usu- 
ally once a year. The largest number is overseen 
by the Ocean Bottom Seismograph Instru- 
ment Pool (OBSIP), which is funded by the 
US National Science Foundation (NSF). These 
temporary stations are useful for retrospec- 
tive analyses, such as tracing how earthquakes 
echoed through the Earth and helping to cal- 
culate the location of molten plumes inside the 
mantle. But they cannot be used for anything
that requires real-time data, such as earthquake
monitoring.


UNDERWATER EARS

New communication technologies could make
it cheaper and easier to monitor seismic activity
on the sea floor.

MERMAID float: Free-floating hydrophone
that can detect pressure
waves from earthquakes.

ADDOSS unit: Wave-powered ‘glider’ that
connects to ocean-bottom
seismometer.

For that purpose, a few nations, including 
Japan and Canada, have installed expensive 
wired arrays of offshore seismic stations that 
receive power and send data along fibre-optic 
cables. The United States will soon install its 
own array as part of the NSF’s Ocean Obser- 
vatories Initiative (see Nature 501, 480-482; 
2013). But for global monitoring, more practi- 
cal and affordable options are now surfacing. 

Berger recently began extended field trials of 
his Autonomously Deployed Deep-Ocean Seis- 
mic System (ADDOSS), which uses ‘gliders’ that 
convert wave motion into thrust. Comprised
of a submerged portion and a surfboard-sized
surface float equipped with solar panels and a
satellite positioning system, the gliders are able 
to wirelessly retrieve data from seismometers on 
the ocean floor (see ‘Underwater ears’). Built by 
Liquid Robotics of Sunnyvale, California, they 
are light enough to be installed and maintained 
by regular ships rather than specialized research 
vessels. If they experience a problem, the gliders 
can be programmed to ‘swim’ to shore. Berger
says that his team has also designed, but not yet 
built, a sleek ocean-bottom seismometer that 
the gliders can tow to a research site. 

“It is a technology that can enable things 
we have wanted to do for a long time for basic 
science and earthquake-hazard studies,” says 


Thorne Lay, a seismologist at the University 
of California, Santa Cruz, who is not affiliated 
with the project. Ocean stations should be able 
to detect small offshore earthquakes that are 
missed by instruments on land, and they will 
yield discoveries about Earth’s mantle, he says. 

Berger's first long-term test of an ADDOSS 
station this winter was interrupted when a 
glider experienced problems. He will 
try again in May or June; if all goes 
well, he envisages deploying 20 such 
stations across the world’s oceans, 
roughly 2,000 kilometres apart. 
The cost of each station — less than 
$200,000 — would be comparable to that of 
installing and maintaining one of the GSN’s 
existing land-based seismic stations. 

A parallel effort pursued by Nolet aims to 
provide even cheaper ocean coverage. The 
Mobile Earthquake Recorder in Marine Areas 
by Independent Divers (MERMAID) system is 
a set of free-floating buoys that drift with the 
current. Each buoy carries a hydrophone that 
can detect pressure waves from large or nearby 
earthquakes but cannot sense the motions of 
the sea floor. Nolet estimates that he could 
blanket the globe with 300 such devices for 
$24 million. Tests of four buoys in the Indian 
Ocean this winter proved that they can ‘hear’ 
through the noise of stormy seas. 
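
The costs quoted for the two systems can be put side by side with a little arithmetic. The sketch below uses only the figures given above (20 ADDOSS stations at less than $200,000 each; 300 MERMAID buoys for $24 million). The variable names are illustrative, and the ADDOSS total is an upper bound implied by the per-station ceiling, not a figure quoted by Berger.

```python
# Figures quoted in the text; the names are illustrative only.
ADDOSS_STATIONS = 20
ADDOSS_COST_CEILING = 200_000   # "less than $200,000" per station, so an upper bound
MERMAID_BUOYS = 300
MERMAID_TOTAL = 24_000_000      # Nolet's estimate for a global fleet

# Upper bound on the cost of the envisaged ADDOSS network.
addoss_network_ceiling = ADDOSS_STATIONS * ADDOSS_COST_CEILING

# Implied average cost of one MERMAID buoy.
mermaid_per_buoy = MERMAID_TOTAL / MERMAID_BUOYS

print(f"ADDOSS network: under ${addoss_network_ceiling:,}")
print(f"MERMAID: about ${mermaid_per_buoy:,.0f} per buoy")
```

On these numbers a single buoy costs well under half as much as a single glider station, although the two instruments do not measure the same thing: the buoys record pressure waves in the water, not motions of the sea floor.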

In April, Nolet plans to deploy ten more 
devices to image the mantle plume that lies 
under the Galapagos Islands. Some previ- 
ous efforts to map plumes have made use of 
OBSIP. “It’s a great programme, but it’s very 
expensive and it can’t do everything,” says 
Cecily Wolfe, a USGS seismologist based at the 
University of Hawaii in Honolulu who used 
the network to investigate the plume beneath 
Hawaii. Technology such as MERMAID or 
Berger's ADDOSS programme could one day 
do similar work more cheaply, and their meas- 
urements could also be combined with those 
of OBSIP to help researchers to recognize 
and filter out ‘contamination’ from seismic
signals originating outside their survey zone, 
Wolfe says. 

If the new technologies succeed, it will be 
a remarkable change for science, says Orcutt. 
But it will be a while before the systems reach 
their full potential, he adds. “We'll need a cou- 
ple of decades of observation before things 
really start to come into focus.”
The Abengoa cellulosic ethanol plant near Hugoton, Kansas, will start production this year.


Cellulosic ethanol 
fights for life 


Pioneering biofuel producers hope that US government 
largesse will ease their way into a tough market. 


BY MARK PEPLOW 


On the flat plains of Kansas, a stack
of gleaming steel towers and pipes
stretches 16 storeys into the sky. More
than 1,000 construction workers toiled to 
complete the ethanol plant near the town of 
Hugoton, and its owners expect it to join a 
fermented-fuel revolution. 

But unlike most ethanol factories, in which 
yeast feeds on sugars in foodstuffs such as 
maize (corn) kernels, the Hugoton facility 
will make use of what has been, until now, 
agricultural waste: cellulose. Thousands of 
tonnes of corn stover — the leaves, stalks and 
husks left over after the maize harvest — are 
already waiting, stacked in square bales, at the 
1.6-square-kilometre site. By June, the plant 
will begin processing the stover into ethanol, 
which will be blended with petrol and end up 
in vehicle fuel tanks. 

The plant, which is owned by multinational 
company Abengoa of Seville, Spain, is one of 
three US facilities that should start commercial 
production of cellulosic ethanol in the next few 
months (the others are both in Iowa, one run by 
POET-DSM Advanced Biofuels and the other
by DuPont). The industry has long promised
that this second-generation biofuel will cut 
greenhouse-gas emissions, reduce US reliance 
on imported oil and boost rural economies. Yet 
just as the fuel is on the cusp of making it big, 
market forces and government policies could 
choke its progress. “This is going to be a very 
critical year,” says Zia Haq, a chemical engineer
and senior analyst at the US Department
of Energy, which has helped to fund the
plants. The challenges have already pushed
some researchers and companies towards
an alternative approach that converts cellulose
into hydrocarbon fuels using chemical rather 
than biological processes. 

With more than 200 operating plants, the 
corn-ethanol industry is well established in 
the United States. Its dramatic growth has 
been driven by tax credits and the Renewable 
Fuel Standard (RFS), created by law in 2005 
and extended in 2007. Administered by the 
US Environmental Protection Agency (EPA), 
the standard mandates annual increases 
in the volumes of various renewable fuels 
included in the country’s fuel supply. In its
early years, the law emphasized the produc- 
tion of corn ethanol, considered ripe for early 
commercialization. 

Yet corn ethanol comes with problems. It 
offers only modest savings in greenhouse-gas 
emissions compared to petrol (see Nature 499, 
13-14; 2013). Production is vulnerable to poor 
harvests and can contribute to increased food 
prices because the maize must be grown on 
land that would otherwise be used for food. 
Tapping the storehouse of biomass left after 
the harvest is much less controversial. Ethanol 
made from corn stover produces at least 60% 
less greenhouse-gas emissions than petrol, and 
making it does not require any extra farmland. 

Brewing such cellulosic ethanol, however, 
is hard. Producers must dismember large, 
indigestible molecules such as cellulose and 
hemicellulose to yield fermentable sugars. 
The process requires the biomass to be ground 
up and pretreated with acids. A cocktail of 
enzymes must then be applied to chop up the 
tough biological polymers inside — all before 
the yeast is added to the resulting sugars. 
Hence the scale of Abengoa’s processing facil- 
ity, much larger and more expensive than any 
corn-ethanol plant. According to the RFS, 
commercial production of cellulosic ethanol 
was meant to start around 2010, but that did 
not happen. With patchy investment backing, 
many companies have fallen by the wayside. 


BLEND WALL 

Part of the problem is that the ethanol mar- 
ket is already saturated. In 2012, the industry 
produced more than 50 billion litres of corn 
ethanol, comprising 10% of US transportation 
fuel — enough to completely satisfy demand 
for the E10 petrol blend that most vehicles now 
burn (see ‘Hitting the wall’). This “blend wall”
puts an upper limit on the amount of ethanol
that the market can absorb. And it is closing 
in: demand for petrol has actually fallen, and 
there is growing interest in alternatives such as 
battery-powered cars. Cellulosic ethanol may 
now be arriving, but its timing is terrible. “We 
don’t have room for any more ethanol,” says 
Wallace Tyner, an agricultural economist at 
Purdue University in West Lafayette, Indiana. 

Corn ethanol is now slightly cheaper than 
petrol, but cellulosic ethanol is more expen- 
sive than both. A cellulosic-ethanol plant’s 
capital costs are roughly twice those of a corn-ethanol
plant, says Tyner, and enzymes raise
operational costs further. Unable to undercut 
its rivals, cellulosic ethanol will be heavily 
dependent on the RFS to clear its path to the 
pumps. Yet its delayed arrival has prompted 
the EPA to reduce the amount of cellulosic 
ethanol that refiners are required to blend into 
their petrol. 

The RFS plan for this year originally called 
for 6.6 billion litres of cellulosic ethanol. But 
in November, the EPA proposed that the man- 
date should be reduced to 64 million litres, a 
mere trickle in comparison. A final ruling is 
expected in March or April. Groups working 
on renewable fuel, who say that producers will 
easily make more than 64 million litres once 
they get going, have cried out. “We think the 
EPA underestimated the capacity of the industry,”
says Christopher Standlee, executive vice-president
of global affairs for Abengoa. The
expensive excess ethanol might have to be sold 
at a loss on the open market, potentially crip- 
pling the fledgling industry. 


CAPACITY PROBLEM 

Cellulosic-ethanol producers have several 
options to increase their market. First, they 
could break through the blend wall. All US 
vehicles produced in the past decade can run 
on a 15% ethanol-petrol blend, but consumers
and distributors are mostly unconvinced,
perhaps spooked by car-industry studies 
claiming that the fuel damages engines. 

Another way over the wall might be exports 
to the European Union, which aims to make 
10% of its transportation fuel renewable by 
2020. And cellulosic ethanol could get cheaper 
with more efficient stover harvesting, beefier 
enzymes and cheaper pretreatments. The 
industry has already cut costs from as much as 
US$9 per gallon (about $2.40 per litre) five or six years
ago to close to $2 today, says Thomas Foust, 
director of the National Bioenergy Center, part 
of the National Renewable Energy Laboratory 
in Golden, Colorado. 
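
Converting a per-gallon price into a per-litre price means dividing by the number of litres in a gallon, not multiplying. A minimal check of the figures above, assuming that the prices refer to US liquid gallons (the function name is illustrative):

```python
LITRES_PER_US_GALLON = 3.785411784  # definition of the US liquid gallon


def per_gallon_to_per_litre(price_per_gallon: float) -> float:
    """Convert a price per US gallon into a price per litre."""
    return price_per_gallon / LITRES_PER_US_GALLON


# The two production costs quoted by Foust, in dollars per gallon.
for price in (9.0, 2.0):
    print(f"${price:.2f} per gallon = ${per_gallon_to_per_litre(price):.2f} per litre")
```

On that assumption, the earlier cost of $9 per gallon corresponds to roughly $2.38 per litre and today's $2 per gallon to roughly $0.53 per litre.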

But Tyner says that this approach can be 
squeezed only so far. He and others see more 
promise in a different approach to breaking 
up cellulose — a brute-force combination of 
temperature, pressure and chemistry. These 
thermochemical methods can produce either 
a crude bio-oil or a stream of carbon mon- 
oxide and hydrogen known as syngas. After 
further treatment and refining with the help 
of chemical catalysts, both can be turned into
hydrocarbons such as petrol, diesel and jet fuel. 
Crucially, these ‘drop-in’ fuels, named because
they can replace normal fuels with no adjust- 
ments to engines, have no blend wall to vault. 

Thermochemical routes can also use lower- 
quality feedstocks, tearing through anything 
from wood chips to municipal solid waste. 
Enerkem, a company in Montreal, Canada, is 
starting its first commercial-scale plant to turn 
solid waste into syngas, in Edmonton. By April 
or May, it will be able to transform the syngas 
into methanol. Next year, it plans to convert 
the methanol into ethanol, and it says that will 
be cheaper than corn ethanol. 

This is mostly because the feedstock is 
cheap. US landfill fees for solid waste are about 
$44 per tonne, not including transportation, so 
municipalities are keen to get companies such 
as Enerkem to take the waste off their hands, 
says Marie-Hélène Labrie, the company’s vice-president
for government affairs and communications.
“Generally, we're being paid to take
the feedstock.” 


HITTING THE WALL 


There is already enough corn ethanol to satisfy US 
demand for a widely used 10% ethanol-petrol blend, 
in part because petrol consumption is declining. 
Producing syngas also gives the company a
lot of flexibility. If there are changes to policy 
mandates or the market, the system could 
be switched to making hydrocarbon fuels or 
higher-value chemical products. Enerkem 
plans to build similar plants in Mississippi 
and Quebec next year, and it is partnering 
with Waste Management in Houston, Texas 
— the biggest US landfill operator — to take 
its waste. 

Research funding, too, is shifting to thermo- 
chemical methods, says Haq. “That doesn't 
mean we're abandoning cellulosic ethanol,” he
says. “But certainly, going forward, we're look- 
ing more seriously at hydrocarbon pathways.” 

Last year, an energy-department project to 
supply the US Navy with advanced biofuels 
provided funding for four facilities that will 
all use thermochemical methods to make 
drop-in fuels. Thermochemical processes are 
also key to the first two commercial cellulosic 
plants in the United States, which opened last 
year: KiOR in Columbus, Mississippi, and 
INEOS Bio near Vero Beach, Florida (see 
‘Power plants’). (Both plants are currently idle, 
pending upgrades.) Haq thinks that longer- 
lived catalysts will further reduce the costs of 
thermochemically produced cellulosic hydro- 
carbons in coming years. 

But Standlee says that biology can still com- 
pete, by tackling ever-cheaper feedstocks. His 
company is betting that a new generation of 
enzymes can turn municipal waste into etha- 
nol, and last July it opened a demonstration 
plant near Salamanca, Spain, to do just that. 
Abengoa hopes that this technology will even- 
tually allow it to expand its US operations 
beyond the ‘corn belt’.

Standlee says that, as long as the cellulosic 
industry is given time to mature — just as corn 
ethanol was — it can get back on the trajectory 
set out by the RFS. “If,” he adds, “the EPA sticks
with the programme.”
ISLANDS OF LIGHT


More than a billion people lack electricity, but now 
microgrids are powering up rural areas. 


In Haiti, the least-electrified country in the
Western Hemisphere, some residents spend
US$10 a month on candles and kerosene just
to light their homes, roughly 125 times what
those in the United States typically pay for the
equivalent light. In India, many pay a premium
to charge their mobile phones from car batteries
at the local market. The Sun still dictates life
for millions of Africans, and diesel generators
burn through budgets on small Pacific islands.
Around the world, nearly 1.3 billion people live
without access to electricity, many of them far
from the ever-expanding electric grid.

The quest is on to find the best way to
bring clean power to rural areas. Mixing local
development work with Silicon-Valley-style
entrepreneurship, engineers, scientists and
economists are setting up independent ‘microgrids’
that can be deployed quickly and cheaply
one community at a time. Those leading such
electrification schemes aim to create small-scale
renewable-energy systems, building
an archipelago of light across the developing
world and helping remote communities to kick 
their dependence on fossil fuels. 

Such efforts have often failed in the past, as 
subsidies lapsed or infrastructure collapsed. 
But today’s entrepreneurs are better placed to 
succeed. A new generation of cheaper photo- 
voltaic panels and wind turbines can be man- 
aged with simple smart-grid devices. The 
price of fossil fuels has soared over the past 
decade, making renewable energy more com- 
petitive. And the United Nations has set a goal 
of achieving universal access to electricity by 
2030, providing political impetus. 

“The ambition is there, and the economics 
are making a lot more sense now than they 
were a few years ago,” says Richenda Van Leeuwen,
executive director for energy access at the
United Nations Foundation. But the challenge 
remains extreme. A 2012 analysis by the Inter- 
national Energy Agency projects that, on the 
basis of current plans, the percentage of people
without access to electricity will fall from 19%
in 2010 to 12% in 2030, leaving nearly 1 billion
people still in the dark. Achieving universal
energy access would mean increasing
investments from a projected $14 billion to
$49 billion a year, the agency says. Centralized
grids are expected to provide only about 30%
of the solution in rural areas.

VOL 507 | 13 MARCH 2014
© 2014 Macmillan Publishers Limited. All rights reserved

Among the projects already on the go are 
a few bright spots with lessons to teach about 
technologies and business models that could 
help to light the world. 


TAMKUHA, INDIA 

When a pair of young Indian entrepreneurs 
flipped the switch to electrify the remote agri- 
cultural village of Tamkuha in 2007, the power 
flowed from rice husks. Gyanesh Pandey and 
Ratnesh Yadav knew that photovoltaic panels 
were too expensive for their plans, and there 
wasn't a lot of wind blowing through this town 
of roughly 2,000 people. But their home state
of Bihar has rice in abundance.

Trained in electrical engineering at
Rensselaer Polytechnic Institute in Troy,
New York, Pandey sketched out a plan with his
long-time friend Yadav. Working with a grant 
of roughly $12,000 from the Indian Ministry of 
New and Renewable Energy, the duo invested 
more than $40,000 of their own money to pur- 
chase and modify a gasifier to turn rice husks 
into biofuel, buy a 32-kilowatt generator, and 
run power lines through the village. 

Within five months, the residents of Tam- 
kuha had enough electricity to charge their 
mobile phones and fend off darkness with two 
compact fluorescent light bulbs per house- 
hold for 6-8 hours a night. Pandey and Yadav 
formed Husk Power Systems with Manoj 
Sinha, who studied business at the University 
of Virginia in Charlottesville, and the company 
now has more than 80 mini-power plants serv- 
ing some 200,000 people in India, Uganda and 
Tanzania. 

Success in Tamkuha proved that even poor 
customers will pay 100 rupees ($1.60) per 
month or more for minimal power, in a coun- 
try where rural households often survive on 
$15-80 a month. The rates are higher than in 
urban centres, but customers typically save 
overall because they purchase less kerosene. In 
2007, says company president Sinha, nobody 
believed that Husk Power could create a viable
business. “But when we scaled up to more
than 300 villages, people started believing in
the model.”

The opportunities in India are huge. 
Although the percentage of people without 
access to electricity in 2011 was only 25% — 
much lower than the 80-90% rates seen in 
some African countries — that still left a record 
300 million people without power in a single
country. The government has invested money 
and attention in the problem, and those efforts 
have slashed the number of people without a 
connection to the main grid by more than half 
in the past decade. But the country is struggling 
to supply enough power to feed all those lines, 
and to hook up the most remote communities. 

Husk Power has become one of the world’s 
largest microgrid developers. And it is dream- 
ing big, targeting 5 million customers within 
five years in India and east Africa. With the 
cost of photovoltaic panels falling, the com- 
pany is building solar microgrids and pairing 
them with storage batteries to meet evening 
demand. And it is experimenting with a solar- 
biomass hybrid power plant intended to pro- 
vide power around the clock. 

“Where Husk is going is very positive,” says 
Daniel Kammen, an energy researcher at the 
University of California, Berkeley. “It is not 
fixating on one technology, it is fixating on 
solutions.” 

But problems may lie ahead. In some areas,
small providers such as Husk Power compete
with the expanding central grid, leaving some 
villages with two suppliers. The microgrids 
tend to be more reliable, but are also more 
expensive, because subsidies usually go into 
capital construction costs rather than towards 
keeping electricity rates low. Kammen says 
governments and companies should agree on 
some basic industry standards for regulation 
and finance, so that investment in microgrids 
— the only solution for some areas — expands 
rather than being undermined. 


TOKELAU, SOUTH PACIFIC 
The Sun shone brightly on Tokelau as a cargo 
ship pulled into harbour in June 2012, bring- 
ing the tiny trio of South Pacific islands their 
largest delivery ever. On board were more than 
4,000 solar panels and 1,000 storage batteries, 
as well as innumerable nails and screws. “We 
thought the island was going to sink,” jokes 
energy minister Foua Toloa. The cargo gave 
Tokelau the moral high ground in the battle to 
halt global warming: it has been widely billed 
as the first nation to accomplish a sweeping 
shift from fossil fuels to renewable energy. 

Tokelau, like almost all small island nations, 
used to rely on diesel-powered generators to 
meet the needs of its 1,400 residents. In its 
first full year of operation, the new 1-mega- 
watt solar system met roughly 93% of the 
nation’s electricity demand. Today, Tokelau has 
reduced its annual fuel bill by about $800,000, 
which more than covers payments on the loan 
it received from the government of New Zea- 
land for the microgrid. “We're very proud,” 
Toloa says. “We are challenging the world and 
the big emitters of greenhouse gases to equal 
or better what Tokelau has done.”

A handful of Caribbean islands has signed
up to that challenge with the help of the
Carbon War Room, a Washington DC-based 
advocacy group founded by British entrepre- 
neur Richard Branson. The Caribbean island 
of Aruba, where wind power currently pro- 
vides 12% of demand, led the way in March 
2012 with a commitment to eliminate fossil- 
fuel use by 2020. But with 109,000 residents 
and regular demand for roughly 100 mega- 
watts of power, Aruba’s challenge is much 
bigger than Tokelau’s. “This is actually a very 
interesting proving ground to test out ambi- 
tious levels of renewables and energy effi- 
ciency,” says Amory Lovins, co-founder of 
the Rocky Mountain Institute in Snowmass, 
Colorado, which co-hosted a clean-energy 
summit for Caribbean nations with the
Carbon War Room in February. Lovins notes 
that the lessons learned about balancing energy 
supply and demand could help with the man- 
agement of mainland power, too. Some US 
states, including New York, are exploring ways 
to divide the main grid into electricity ‘islands’ 
that could be isolated in the face of large-scale 
outages. And, Lovins adds, island projects such 
as that on Aruba may help to convince the world 
that reliable power systems can be built almost 
entirely from renewables. 

Tokelau has not yet achieved its goal of going 
100% renewable. The old diesel generators still 
occasionally kick in to charge batteries during 
the rainy season, and many residents rely on 
imported gas for cooking. The nation’s govern- 
ment is planning to help residents to purchase 
more efficient appliances or make the switch 
to electric cookers. Air conditioners, consid- 
ered an unnecessary luxury on the island, have 
already been banned for government use. And 
if the economics work out, as early as next year 
the country hopes to begin producing coco- 
nut oil to power the generators when the Sun 
doesn't shine. “We've got plenty of extra coconuts,”
Toloa says.


SINE MOUSSA ABDOU, SENEGAL 

Residents in the village of Sine Moussa Abdou 
once had to trek ten kilometres to a neighbour- 
ing village to charge their mobile phones, pay- 
ing fees as high as $110 per kilowatt hour — the 
average US rate is 12 cents. Those with televi- 
sions hauled a car battery to be recharged. The 
village's 900 residents now pay about $1.40 per 
kilowatt hour for power delivered to their 
homes through a microgrid built in 2009. The 
company in charge of supplying the power — a 
combination of wind, solar and diesel — says 
all of the students in the village school passed 
their annual exam for the first time one year 
after electrification, thanks to having enough 
light at night to study by. 
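
The three prices quoted above span almost three orders of magnitude, which is easier to see as ratios. A small sketch using only the figures in the text (the variable names are illustrative, and $110 is the upper end of the old charging fees, not an average):

```python
# Prices quoted in the text, in US dollars per kilowatt hour.
PHONE_CHARGING_FEE = 110.00  # upper end of the old fees in the neighbouring village
MICROGRID_RATE = 1.40        # what villagers now pay through the microgrid
US_AVERAGE_RATE = 0.12       # average US rate cited for comparison

# How much cheaper the microgrid is than the old charging fees.
saving_factor = PHONE_CHARGING_FEE / MICROGRID_RATE

# How much more villagers still pay than the average US customer.
premium_over_us = MICROGRID_RATE / US_AVERAGE_RATE

print(f"Old fee versus microgrid: about {saving_factor:.0f} times")
print(f"Microgrid versus US average: about {premium_over_us:.0f} times")
```

On these figures the microgrid cut the highest charging fees by a factor of almost 80, yet villagers still pay roughly 12 times the average US rate.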

The project is just one of many seeking to 
solve the massive energy problem in sub-Saha- 
ran Africa, where nearly 600 million people — 
more than two-thirds of the population — lack 
access to electricity (see ‘In the dark’). But it is an
innovative example of public-private partner- 
ship that many observers are watching closely. 

The project, a partnership of Inensus in Gos- 
lar, Germany, and Matforce, based in Dakar, 
Senegal, was divided into two parts: interna- 
tional grants were used to wire up the village, 
but the power generation and supply is entirely 
unsubsidized. Inensus uses smart meters to 
track customers’ usage, and asks users to pay 
for weeks’ worth of power in advance, offer- 
ing a discounted rate to those who predict and 
commit six months ahead. That information 
helps to keep costs and emissions down by 
ensuring that the wind and solar systems can 
cope, and that the diesel generators are not 
leaned on too heavily; they typically mop up 
the last 10-20% of demand. 
IN THE DARK
Nearly 1.3 billion people worldwide, or 19% of the global population, lacked electricity in 2010. That number is
projected to decline to about 1 billion people, or 12% of the global population, by 2030. More than eight out of
ten people without access to electricity live in rural areas, making independent microgrids an attractive solution.

The proportion of people without power in India, currently 25%, is expected to shrink to 10% by 2030.

Sine Moussa Abdou, Senegal: combined wind, solar and diesel microgrid.

175 remote Canadian communities are off-grid; most of these rely on diesel.

Forecasts suggest that Latin America will achieve universal access by the mid-2020s.


Although the cost of the power is more than 
three times what urban customers might face, 
the business model is designed to promote sus- 
tainability and flexibility. Nico Peterschmidt, 
managing director of Inensus, foresees a sce- 
nario in which local companies own a grid 
and contract out the supply of power, foster- 
ing competition and freeing up companies and 
communities to shop around. 

The company is now expanding the project 
into five nearby villages and has launched a 
larger project targeting 16 villages and 82,000 
people in Tanzania. Peterschmidt says the 
Tanzanian government has perhaps the most 
advanced microgrid policy in the world, 
including a simple subsidy of $500 per connec- 
tion for grid infrastructure, which covers the 
bulk of up-front costs. The biggest challenge, 
he says, is convincing governments to abandon 
fixed electricity rates that do not allow for company
profit. “If we can overcome that, we can
accelerate the private sector to provide energy
access,” he says. Kammen notes that there is
generally enough commercial competition and 
watchdog activity to prevent price abuses, with 
most projects providing electricity at or below 
the price of diesel power. 

The trick, notes Pepukaye Bardouille, an 
energy analyst who tracks microgrids at the 
International Finance Corporation in Wash- 
ington DC, is to balance a nation’s desire to 
attract profitable industry with the wish for 
their poorest people to get electricity. “Are we 
trying to promote commercially viable busi- 
nesses? Or are we trying to promote access at 
any cost?” she asks. “Sometimes those two do 
not overlap.” 

Dean Cooper, an energy-finance specialist
at the United Nations Environment Programme
(UNEP) in Paris, says that UNEP is
working on microgrid demonstrations in different
countries to determine which policies
and models work best. At present, it is too early
to say which will win out. “All of the business
models can be scaled up on paper,” says Bardouille.
“In practice, it is harder to deliver.”

Nearly 80% of people without electricity in the Middle East are in Yemen.

The number of people without power in Africa is expected to increase owing to population growth.


ATLIN, CANADA 

For years, the only source of power in Atlin, 
an old mining town of some 400 people in the 
northwest corner of British Columbia, was 
diesel generators. The steady drone and smelly 
fumes were a constant reminder that money was 
going up in smoke, and members of the Taku 
River Tlingit First Nation, who make up 25% of 
the town’s population, were determined to find 
an alternative. After experimenting with a wind 
turbine, which buckled under ice and wind in 
the winter of 2002-03, the tribal band settled on 
a small hydroelectric project. With $15 million 
from grants, community funds and loans, the 
Atlin hydroelectric project began generating 
2.1 megawatts of power on 1 April 2009. 

Atlin is enjoying the benefits. Eliminat- 
ing diesel prevented more than 5,000 tonnes 
of greenhouse-gas emissions last year. And, 
because the First Nation owns the hydroelectric 
station, the money that residents pay for power 
stays at home. “We are paying our loans, but 
there’s a little bit extra that is benefiting the community,”
says Stuart Simpson, general manager
of the Atlin Tlingit Development Corporation. 

There are 175 aboriginal or northern off-grid 
communities in Canada, most of which rely 
on diesel. The Taku River Tlingit First Nation 
was one of the first to switch to home-owned 
hydropower, and others are looking to follow.

A new wave of small-scale hydroelectric 
development aiming to supply both off-grid 
and on-grid energy in British Columbia has 
spurred controversy about the potential eco- 
logical impacts of such projects. The Vancou- 
ver-based Wilderness Committee, for one, has 
voiced concerns about possible disturbances 
to grizzly bear habitat and salmon-bearing 
rivers. But the actual effects are hard to tease 
out. A review released in January by the Pacific 
Salmon Foundation, a conservation organiza- 
tion in Vancouver, found no conclusive evi- 
dence about effects on fish. 

Nigel Protter, executive director of the BC 
Sustainable Energy Association, says modern 
hydroelectric projects can, if well-conceived 
and implemented, improve the local ecosys- 
tem; the Atlin project, for example, built a fish 
ladder to help graylings get around the small 
dam, and Simpson says that fish counts have 
increased. The problem, Protter says, is that 
many rural communities in the developed 
world want more power than their small riv- 
ers can provide without the construction of 
dams to store up water. “Storage often creates 
additional environmental and social impacts.” 

Atlin has all the power it needs for now; it is 
even considering expanding its project to tie 
into the main electric grid and to export power 
to the northern Yukon territory. “In 20 years, 
when we have this bank loan paid off, we'll 
have a couple million coming into the community
each year,” Simpson says. “This is really
about our grandkids.”


Jeff Tollefson covers climate, energy and the 
environment for Nature. 


IMAGE: CRAIG MAYHEW AND ROBERT SIMMON, NASA GSFC/DATA SOURCE: IEA, WORLD ENERGY OUTLOOK 2013 
Legend
of the wolf 


Predators are supposed to exert strong control over 
ecosystems, but nature doesn’t always play by the rules. 


BY EMMA MARRIS 


In the summer of 2008, Kristin Marshall was driving through
Yellowstone National Park in Wyoming. Marshall, a graduate student 
at the time, had come to the park to study willow shrubs — specifically, 
how much they were being eaten by elk. 

She pulled to the side of the road and was preparing to hike to one of 
her study plots when she ran into two sisters from the Midwest, who were 
touring the park. The women asked what Marshall was doing and she 
said, “I am a researcher. I am working in that willow patch down there.”

The tourists gushed: “We watched all about the willows on this nature 
documentary. We hear that all the willows are doing so much better now 
because the wolves are back in the ecosystem.” That stopped Marshall
short. “I didn’t want to say, ‘No, you are wrong, they aren’t actually doing
that well.’”

The return of grey wolves to the western United States has sparked debate over their role in structuring ecosystems.

Instead, she said: “The story is probably a little more complicated
than what you saw on the nature documentary.” That was the end of the 
conversation; the tourists seemed uninterested in the more-complicated 
story of how beavers and changes in hydrology might be more impor- 
tant than wolves for willow recovery. “I can't say I blame them,” says 
Marshall, now an ecologist with the US National Oceanic and Atmos- 
pheric Administration in Seattle, Washington. “What you see on TV 
is captivating.” 

On television and in scientific journals, the story of how carnivores 
influence ecosystems has seized imagina- 
tions. From wolves in North America to lions 
in Africa and dingoes in Australia, top preda- 
tors are thought to exert tight control over the 
populations and behaviours of other animals, 
shaping the entire food web down to the veg- 
etation through a ‘trophic cascade’. This story 
is popular in part because it supports calls to 
conserve large carnivores as ‘keystone spe- 
cies’ for whole ecosystems. It also offers the 
promise of a robust rule within ecology, a field in which researchers 
have yearned for more predictive power. 

But several studies in recent years have raised questions about the 
top-predator rule in the high-profile cases of the wolf and the dingo. 
That has led some scientists to suggest that the field’s fascination with 
top predators stems not from their relative importance, but rather from 
society’s interest in the big, the dangerous and the vulnerable. “Predators 
can be important,” says Oswald Schmitz, an ecologist at Yale University
in New Haven, Connecticut, “but they aren't a panacea.” 


PREDATORS ON TOP 

In the early years of ecology, predators did not get so much respect. 
Instead, researchers thought that plants were the dominant forces in 
ecosystems. The theory was that photosynthesis from these primary 
producers determined how much energy was available in an area, and 
what could live there. Bottom-up control was all the rage. 

Interest in top-down trophic cascades emerged in 1963, when ecolo- 
gist Robert Paine of the University of Washington in Seattle started 
to exclude predators from study plots at his coastal research site. He 
pried predatory starfish off intertidal rocks and hurled them into deeper 
waters. Without the starfish to control their numbers, mussels eventu- 
ally carpeted the plots and kept limpets and algae from taking hold in 
the region. A new ecosystem emerged (see Nature 493, 286-289; 2013). 

After this and other aquatic studies, the conventional wisdom in the 
field was that top-down trophic cascades happened only in rivers, lakes 
and the sea. An influential 1992 paper¹ by Donald Strong at the University
of California, Davis, asked: “Are trophic cascades all wet?” As if in
answer, ecologists began looking for similar carnivore stories on land. 

They soon found them. In 2000, a review² tallied 41 terrestrial studies
on trophic cascades, most of which showed that predation had signifi- 
cant effects on the number of herbivores in an area, or on plant damage, 
biomass or reproductive output. These studies were all on small plots 
involving small predators: birds, lizards, spiders and lots of ants. 

Research on terrestrial trophic cascades moved to much larger scales 
with the work of John Terborgh and William Ripple. In 2001, Terborgh, 
an ecologist at Duke University in Durham, North Carolina, reported³ on
dramatic ecosystem changes that came after a dam was built in Venezuela. 
Flooding from the dam created islands that were too small to support big 
predators such as jaguars and harpy eagles. The population densities of 
their prey — rodents, howler monkeys, iguanas and leaf-cutter ants — 
boomed to 10-100 times those on the mainland. Seedlings and saplings
were devastated.

In the same year, Ripple, an ecologist at Oregon State University in
Corvallis, published a key paper⁴ on the most famous, and probably the
best-studied, example of a terrestrial carnivore structuring an
ecosystem: Yellowstone's wolves. The ecosystem offered a natural experiment
because the US National Park Service had exterminated the park’s
wolves (Canis lupus) by 1926 and then reinstated them in the 1990s,
after public sentiment and ecological theory had shifted. In 1995, 
14 wolves from Alberta, Canada, were introduced into the park. Sev- 
enteen from British Columbia followed in 1996. By 2009, there were 
almost 100 wolves in 14 packs in the Yellowstone area. (That number is 
now down to 83 in 10 packs.) 

During the years when there were no wolves, ecologists grew increas- 
ingly worried about the aspen trees (Populus tremuloides) in the park. 
It seemed that intensive browsing by Rocky 
Mountain elk (Cervus elaphus) was preventing 
trees from reaching adult height, or ‘recruiting’.
In the early twentieth century, aspen covered
between 4% and 6% of the winter range
of the northern Yellowstone herd of elk; by the 
end of the century, they accounted for only 
1% (ref. 4). 

When Ripple and his co-authors checked 
aspen growth against the roaming behaviour 
of wolves in three packs, they found that aspen grew tallest in stream- 
side spots that saw high wolf traffic. That pattern hinted at an indirect 
behavioural cascade: rather than limiting browsing by reducing elk 
populations throughout the park, wolves apparently made elk more 
skittish and less likely to browse in the tightly confined stream valleys, 
where prey have limited escape routes (see ‘The tangled web’). A 2007
study⁵ by Ripple and Robert Beschta, also of Oregon State, seemed to
strengthen the behavioural-cascade hypothesis. It found that the five 
tallest young aspen in stream-side stands where there were downed logs 
— a potential trip hazard for elk — were taller than the five tallest young
aspen in stands away from streams or without downed logs. 

Similar evidence of indirect wolf effects emerged from a study of willows.
In 2004, Ripple and Beschta found⁶ that the shrubs were returning
in narrow river valleys, where the researchers thought that the chances 
of wolves attacking elk were greatest. 

More recently, Ripple has been documenting the regrowth of cotton- 
wood trees. “When we look around western North America, we see a 
big decrease in tree recruitment after wolves were removed. And when 
wolves returned to Yellowstone, the trees started growing again. It is just 
wonderful to walk through that new cottonwood forest.” 


TALES FROM TREES 
But some ecologists had their doubts. The first major study⁷ critical of
the wolf effect appeared in 2010, led by Matthew Kauffman of the Wyo- 
ming Cooperative Fish and Wildlife Research Unit in Laramie. When 
researchers drilled boreholes into more than 200 trees in Yellowstone 
and analysed growth patterns, they found that the recruitment of aspen 
had not ended all at once. Some trees had reached adult size as late as 
1960, long after the wolves had gone. And some stands had stopped 
growing new adults as early as 1892, well before the wolves left. The 
aspen petered out over decades, as elk populations slowly grew, suggest- 
ing that the major influence on the trees is the size of the elk population, 
rather than elk behaviour in response to wolves. And although wolves 
influence elk numbers, many other factors play a part, says Kauffman: 
grizzly bears are increasingly killing elk; droughts deplete elk popula- 
tions; and humans hunt elk that migrate out of the park in winter. 

When Kauffman and his colleagues studied⁷ aspen in areas where
risk of attack by wolves was high or low, they obtained results different 
from Ripple’s. Rather than look at the five tallest aspen in each stand, as 
Ripple had done, they tallied the average tree height and used locations 
of elk kills to map the risk of wolf attacks. By these measures, they found 
no differences between trees in high- and low-risk areas. 
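
The methodological point here, that the same stands can look different depending on the summary statistic, can be illustrated with a small sketch. The heights below are invented for illustration and are not data from either study; the function names and the centimetre units are likewise assumptions.

```python
from statistics import mean


def mean_of_tallest(heights_cm, k=5):
    """Mean height of the k tallest stems in a stand (the 'five tallest' approach)."""
    return mean(sorted(heights_cm, reverse=True)[:k])


def mean_height(heights_cm):
    """Mean height of every stem in a stand (the 'average tree height' approach)."""
    return mean(heights_cm)


# Hypothetical stands: a few tall stems among heavily browsed ones,
# versus a stand of uniformly middling stems.
high_risk = [20, 25, 30, 35, 40, 50, 60, 210, 220, 230, 240, 250]
low_risk = [90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145]

for name, stand in (("high-risk", high_risk), ("low-risk", low_risk)):
    print(name, mean_of_tallest(stand), mean_height(stand))
```

With these made-up numbers the five-tallest means are 230 cm and 135 cm, whereas both stand-wide means are 117.5 cm. The sketch shows only that the two measures can disagree; it says nothing about which better reflects what happened in Yellowstone.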

Questions have also emerged about the well-publicized relationship 
between wolves and willows. Marshall and two colleagues investigated 
the controls on willow shrubs by examining ten years’ worth of data 
from open plots and plots surrounded by cages to keep the elk out.
Her team found⁸ that the willows were not thriving in all the protected
sites. The only plants that grew above 2 metres — beyond the reach of
browsing elk — were those in areas where simulated beaver dams had
raised the water table.

The tangled web: Researchers disagree on whether the return of wolves to Yellowstone National Park (1) sparked a resurgence of aspen trees by limiting browsing by elk. One study
found that aspen grow better in stream areas with fallen trees (2), where elk may feel most vulnerable to wolves. But another study found that aspen fare poorly
even in areas where elk are most at risk from wolves (3).

If beavers have a key role in helping willows to thrive, as Marshall’s 
study suggests, the shrubs face a tough future because the park's beaver 
populations have dropped. Researchers speculate that the removal of 
wolves in the 1920s allowed elk to eat so much willow that there was 
none left for the beavers, causing an irreversible decline. 

“The predator was gone for at least 70 years,” says Marshall. “Removing
it has changed the ecosystem in fundamental ways.” This work suggests 
that wolves did meaningfully structure the Yellowstone ecosystem a cen- 
tury ago, but that reintroducing them cannot restore the old arrangement. 

Arthur Middleton, a Yale ecologist who works on Yellowstone elk, 
says that such studies have disproved the simple version of the trophic 
cascade story. The wolves, elk and vegetation exist in an ecosystem with 
hundreds of other factors, many of which seem to be important, he says. 


DINGO DEBATE 

Another classic example of a trophic cascade has come under attack in 
Australia. The standard story there is that the top predator, the dingo 
(Canis lupus dingo), controls smaller introduced predators such as 
cats and foxes, allowing native marsupials to thrive. But Ben Allen, an 
ecologist at the Department of Agriculture, Fisheries and Forestry in 
Toowoomba, has compared’ areas where dingoes are poisoned with 
areas where they are left alone, and found no difference in marsupial 
abundance. He is quite cynical, he says, about “this idea that top preda- 
tors are wonderful for the environment and will put everything back to 
the Garden of Eden”. 

Allen’s opponents counter that he has failed to show that the poisoning
regimens actually reduce dingo population densities. Chris Johnson,
an ecologist at the University of Tasmania in Hobart, says he is “very
critical” of Allen’s experimental design and methods. The dingo effect 
is real, says Johnson. 

Ripple is not worried about these debates, which he views as quib- 
bling over details that do not undermine the overall strength of the 
trophic-cascade hypothesis. In fact, when he published a major review¹⁰
this year of the effects that predators exert over ecosystems, he left out 
studies critical of the wolf and dingo trophic-cascade theories; he says 
that there was no room for them in the space he had to work with. Ripple 
is particularly concerned with documenting the impacts of Earth's top 
carnivores because so many are endangered. “We are losing these carni- 
vores at the same time that we are learning about their ecological effects,” 
he says. “It is alarming, and this information needs to be brought forth.” 


The debate has been harsh at times, but in quieter moments the differ- 
ent factions all tend to talk in similar terms about the great complexity of 
ecosystems and the likelihood that the truth lies somewhere in the mid- 
dle. James Estes, an ecologist at the University of California, Santa Cruz, 
and one of the fathers of the trophic-cascade idea, says that the evidence 
for cascades mediated by changes in animal behaviour rather than by 
changes in animal number is “thin” at the moment — and that many of
the effects that have been documented are spotty and badly need to be 
rigorously mapped out. Still, he adds, “When all is said and done, and 
everyone is dead 100 years from now, Bill [Ripple] will be closer to right”. 

Although Ripple stresses the role of the top carnivores, he agrees they 
are not the end of the story. “I believe in the combination of top-down 
and bottom-up, working in unison,” he says. “They are both playing out 
on any given piece of ground and the challenge will be to discover what 
determines their interactions and relative effects.”

Schmitz has some thoughts on how to do that. His own smaller-scale 
work on invertebrates has convinced him that neither bottom-up nor 
top-down theories adequately capture the story of ecosystems. He is 
starting to look at the middle players, such as elk, beavers and grass- 
eating grasshoppers. These herbivores, he says, integrate influences 
from both the top (such as predation pressure) and the bottom (such as 
the nutritional quality of plants). “It is not really bottom-up or top-down 
but trophic cascades from the middle out,” he says. “That is where we
will evolve. It is knowing what the middle guy is going to do that gives 
you the predictive ability.” 

It remains to be seen whether theories such as this middle-out idea 
will grip researchers and the public as much as the theory of top-down 
cascades. Many researchers have doubts. They worry that tales of preda- 
tors shaping their ecosystems are so attractive that they have unrivalled 
control over discourse. “Everyone likes to think of the big wolf or the 
big bear looking after the environment,” says Allen. “We do love a good
story.” SEE EDITORIAL P.139


Emma Marris is a freelance writer in Klamath Falls, Oregon. 
Earlier this year we submitted an unusual
paper to a scientific journal. What is

unusual about it is not the topic — 
computations of how interactions between 
light and matter in the primordial Universe 
affected large-scale cosmic structures — but 
what inspired it. The paper draws on ideas 
in a medieval manuscript by the thirteenth- 
century English scholar Robert Grosseteste. 

De Luce (On Light), written in 1225 in 
Latin and dense with mathematical thinking, 
explores the nature of matter and the cosmos. 
Four centuries before Isaac Newton proposed 
gravity and seven centuries before the Big 
Bang theory, Grosseteste describes the birth 
of the Universe in an explosion and the crys- 
tallization of matter to form stars and planets 
in a set of nested spheres around Earth.

To our knowledge, De Luce is the first 
attempt to describe the heavens and Earth 
using a single set of physical laws. Implying, 
probably unrealized by its author, a family of 
ordered universes in an ocean of disordered 
ones, the physics resembles the modern 
‘multiverse’ concept. 

Grosseteste’s treatise was translated and 
interpreted by us as part of an interdiscipli- 
nary project led by Durham University, UK, 
that includes Latinists, philologists, medi- 
eval historians, physicists and cosmologists 
(see ordered-universe.com). Our experience 
shows how science and humanities scholars 
working together can gain fresh perspectives 
in both fields. And Grosseteste’s thesis dem- 
onstrates how advanced natural philosophy 
was in the thirteenth century — it was no 
dark age. 


A thirteenth-century depiction of the geocentric cosmos.

Ideas in a thirteenth-century treatise on the nature of matter still
resonate today, say Tom C. B. McLeish and colleagues.


EARLY INSIGHTS

Many coffee-table histories of science
maintain that the natural philosophy of the
medieval centuries constituted a scientific
dead end — burrowing ever deeper into
alchemy and astrology. A closer examination
reveals a more nuanced story. Preserved
on vellum manuscripts, written in coded
medieval Latin and enveloped in unfamiliar
metaphysics it may be, but the science of the
twelfth and thirteenth centuries constitutes a
crucial stage in the history of thought.

By the late twelfth century, Aristotle’s
observation-oriented science had burst
[…]ted in a long series of cross-cultural
translations from Greek to Arabic to Latin. Great
questions arose in the minds of scholars
such as Grosseteste, Averroes (in Cordoba)
and Gerard of Cremona (in Toledo). What is
colour? What is light? How does the rainbow
appear? How was the cosmos formed? We
should not underestimate the imaginative
work needed to conceive that these questions
were, in principle, answerable.

Grosseteste (c.1175-1253) rose from 
obscure Anglo-Norman origins to become a 
respected theologian and Bishop of Lincoln. 
He was one of the first in northern Europe 
to read the newly translated scientific works 
of Aristotle, attempting to take forward the 
big questions of what we can know about 
the natural world (ontology) and how we 
know it (epistemology). The late thirteenth- 
century philosopher Roger Bacon called 
him “the greatest mathematician” of his 
time. Grosseteste’s work on optical phys- 
ics influenced mathematicians and natural 
philosophers for generations, notably in 
Oxford during the fourteenth century and 
in Prague during the fifteenth. 

Exploring the scientific thought of the thir- 
teenth century is inherently interdisciplinary, 
requiring knowledge of Latin, history and 
philosophy, as well as of mathematics and 
science. Our collaboration at Durham began 
in 2008, following a seminar on Grosseteste 
by one of us, Tom McLeish, a physicist who 
had become interested in the thirteenth- 
century thinkers after hearing talks at Leeds 
University, UK, by historian James Ginther of 
Saint Louis University in Missouri. 

Intrigued, medieval scholars at 
Durham, including Giles Gasper, recruited 
other Grosseteste specialists, including 
Cecilia Panti at the University of Rome Tor 
Vergata, Neil Lewis at Georgetown Uni- 
versity in Washington DC, and the Lati- 
nist Greti Dinkova-Bruun at the Pontifical 
Institute of Mediaeval Studies in Toronto, 
Canada. Before tackling De Luce, we honed 
our skills²,³ on two simpler short works by
Grosseteste: De Colore, on colour theory, 
and De Iride, on the rainbow, aided by col- 
our psychophysicist Hannah Smithson at 
the University of Oxford, UK, and Durham 
optical physicist Brian Tanner. 


LIGHT WORK 

Grosseteste’s De Luce, available in English 
since the 1940s, opens by addressing a prob- 
lem with classical atomism: why, if atoms 
are point-like, do materials have volume? 
Light is discussed as a medium for filling 
space. Grosseteste’s recognition that mat- 
ter’s bulk and bulk stability requires subtle 
explanation was impressive. Even more 
intriguing was his use of mathematics to 
illuminate his physics. 

A finite volume, he writes, emerges from 
an “infinite multiplication of light” acting on 
infinitesimal matter. He draws an analogy to 
the finite ratio of two infinite sums, claiming
that (1 + 2 + 4 + 8 + …)/(0.5 + 1 + 2 + 4 + …) is
equal to 2. He does not articulate carefully 
the idea of the limits one needs to make this 
rigorous, but we know what he means — 
simultaneously adding to both numerator 
and denominator keeps the ratio finite. 
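
The limiting argument that Grosseteste leaves implicit can be written out with partial sums. What follows is a modern reconstruction rather than notation from De Luce; the symbols $N_n$ and $D_n$ are introduced here purely for illustration.

```latex
\begin{align*}
N_n &= \sum_{k=0}^{n} 2^{k} = 2^{n+1} - 1, \\
D_n &= \sum_{k=0}^{n} 2^{k-1} = \tfrac{1}{2}\bigl(2^{n+1} - 1\bigr), \\
\frac{N_n}{D_n} &= 2 \quad \text{for every } n,
\qquad\text{so}\qquad \lim_{n\to\infty} \frac{N_n}{D_n} = 2 .
\end{align*}
```

Each sum diverges on its own, but truncating both after the same number of terms gives a ratio of exactly 2 at every step, which is the sense in which the ratio stays finite.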
The third remarkable ingredient of 
De Luce to modern eyes is its universal 
canvas: it suggests that the same physics 
of light and matter that explains the solid- 
ity of ordinary objects can be applied to the 
cosmos as a whole. An initial explosion of
a primordial sort of light, lux, according
to Grosseteste, expands the Universe into
an enormous sphere, thinning matter as it
goes. This sounds, to a twenty-first-century
reader, like the Big Bang.

A simulation of Robert Grosseteste’s nine-sphere universe.

Then Grosseteste makes an assumption: 
matter possesses a minimum density at 
which it becomes ‘perfected’ into a sort of 
crystalline form. Today, we would call this 
a phase transition. The perfection occurs 
first at the thinnest outer edge of the cosmos, 
which crystallizes into the outermost sphere 
of the medieval cosmos. This perfect matter 
radiates inward another sort of light, lumen,
which is able to push matter by its radiative 
force, piling it up in front and rarefying it 
behind. An analogous process in today’s 
physics is the inward propagation of shock 
waves in a supernova explosion. 

Like a sonata returning to its theme, that 
finite ratio of infinite sums reappears, this 
time as a ‘quantization condition’ — a rule
that permits only discrete solutions such as 
the energy levels in atoms — that limits mat- 
ter to a finite number of spheres. Grosseteste 
needed to account for nine perfect spheres in 
the medieval geocentric cosmos: the ‘firmament’,
the fixed stars, Saturn, Jupiter, Mars,
the Sun, Venus, Mercury and the Moon. By 
requiring that the density is doubled in the 
second sphere and tripled in the third, and 
so on, a nested set of spheres results. 
In an impressive final stroke of unification,
he postulates that towards the centre of the
cosmos, the remaining unperfected matter
becomes so dense and the inwardly radiating
lumen so weak, that no further perfection
transitions are possible. He thus accounts 
for the Aristotelian distinction between the 
perfect heavens and the imperfect Earth and 
atmosphere. 


MODERN TOOLS 

To our knowledge, De Luce is the first worked 
example showing that a single set of physi- 
cal laws might account for the very different 
structures of the heavens and Earth, hun- 
dreds of years before Newton's 1687 appeal 
to gravity to unite the falling of objects on 
Earth with the orbiting of the Moon. Our 
translation has also cleared up a misconcep- 
tion in some previous studies that the light in 
Grosseteste’s treatise travelled both inwards 
and outwards. 

To explore the consequences of the phys- 
ics in the treatise further, and to urge a 
closer and more careful reading of the text, 
the science team turned to modern tools. 
De Luce is remarkably precise in its formula- 
tion of physics — had Grosseteste had access 
to the mathematics of calculus and the com- 
puting power we have today, it would have 
been natural to apply them. 

We identified six physical ‘laws’ in the 
manuscript, including the interaction of 
light and matter, the critical criteria for per- 
fection, and the re-radiation and absorption 
of lumen. We wrote down these laws math- 
ematically, including modern concepts such 
as opacity, which were consistent with the 
text although not described explicitly in it. 
Then we computed the resulting equations, 
expressed in differential form, in three- 
dimensional spherical symmetry. 

To assess the range of possible solutions to 
these novel equations, and out of curiosity, 
Durham cosmologist Richard Bower then 
computed the space of possible medieval 
universes by varying the values of four para- 
meters: the gradient of the initial ‘Big Bang’ 
matter distribution, the coupling strength of 
light and matter, the opacity of impure matter 
and the transparency of the perfected spheres. 
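
The shape of such a parameter scan can be sketched in a few lines. This is only an illustration of sweeping a four-parameter grid and tallying outcomes: the parameter names paraphrase the four quantities listed above, the grid values are placeholders, and `toy_classify` is a hypothetical stand-in that contains none of the physics of the actual model.

```python
from itertools import product

# Placeholder grid; the values are NOT those used in the Durham calculation.
GRID = {
    "initial_density_gradient": [0.5, 1.0, 2.0],
    "light_matter_coupling": [0.1, 1.0, 10.0],
    "impure_opacity": [0.1, 1.0],
    "sphere_transparency": [0.5, 0.9],
}


def sweep(classify, grid=GRID):
    """Call classify(params) at every grid point and count each label returned."""
    names = list(grid)
    tally = {}
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        label = classify(params)
        tally[label] = tally.get(label, 0) + 1
    return tally


def toy_classify(params):
    """Hypothetical stand-in: calls a narrow band of the grid 'ordered'."""
    if (params["initial_density_gradient"] == 1.0
            and params["light_matter_coupling"] == 1.0):
        return "ordered"
    return "other"


tally = sweep(toy_classify)
print(tally)
```

In a real calculation the classifier would integrate the model equations for each parameter set and inspect the resulting spheres; the toy rule here exists only so that the sweep runs end to end.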

A rich set of solutions emerged. A narrow 
set of parameters did indeed produce the 
series of celestial perfected spheres and, 
within the Moon’s orbit, a further four spheres 
corresponding to fire, air, water and earth 
— as the medieval world view demanded. 
But most choices of the four parameters 
yielded no spheres, or a disordered mess 
of hundreds of concentric spheres with no 
radial pattern to their densities. Other pos- 
sible model universes contained infinite 
numbers of spheres, some with unbounded 
density. The project had unwittingly 
stumbled on a medieval multiverse. 

The possible existence of more than one
universe was indeed a live issue of the period,
and a highly contentious one — appearing, 
for example, in the Papal edict of 1277 that 
banned a list of scientific teachings. But it 
was a debate that Grosseteste apparently 
chose to avoid. None of his surviving trea- 
tises discusses the possibility of other forms 
of universe, however close he came to imply- 
ing it in his cosmogony. 

Of course we know now, thanks to 
telescope observations from the early sev- 
enteenth century onwards, that a geocentric 
cosmos is untenable. But in 1225, it was the 
simplest theory consistent with the obser- 
vations. Grosseteste’s effort to give a physi- 
cal account of its origin is an impressive 
achievement, but it also reminds us of the 
limitations of our own current cosmological 
theory, with its reliance on intangible factors 
such as ‘dark matter’ and ‘dark energy’.


SCIENTIFIC VALUE 

The translation of De Luce is an exemplar 
of the importance of collaborations between 
the arts and sciences, of thinking and learn- 
ing together in new ways, and a reminder 
that the intellectual tradition we now call 
science has a long and rich history.

Both the scientists and the humanities 
scholars in our collaboration found work- 
ing together enriching and transforma- 
tional; it forced us to engage with different 
ideas and problems. There were challenges:
getting used to each other’s methodologies
and approaches took time and patience. 
And our expectations changed. At the start 
of the project we had hoped for a sharper 
understanding of the text; we were surprised 
when new science emerged as well. 

What next for the collaboration? The 
Durham-led team has examined three of 
Grosseteste’s science works in detail so far. 
There are at least another ten to explore, 
including a work on the origin of sounds 
(De Generatione Sonorum). The scientific 
writings of Grosseteste’s immediate prede- 
cessors, Alfred of Sareshull and Alexander 
Neckham, and his successors, including 
Bacon, could hold similar insights into the 
evolution of ideas. 

Funding for such interdisciplinary work, 
however, remains a problem. In the United 
Kingdom, none of the scientific research 
councils offered grants for such a project. 
In the end, we were funded by the Arts and 
Humanities Research Council. The US grant- 
ing system is similarly biased. The European 
Research Council and philanthropic sources, 
such as the Wellcome Trust, do fund science–arts
projects, but in our experience it is easier
to obtain funding for science and social 
science collaborations than for science and 
humanities partnerships. 

Because projects such as ours can be of sig- 
nificant scientific and cultural value, scientific 
granting agencies should consider funding
arts and sciences projects or partnering with
arts and humanities councils to translate 
other early scientific works, for example. 

The eight-century journey from Grosse- 
teste’s cosmological ideas to our own offers 
a rich illustration of the slow evolution in 
our understanding, and of the delight to be 
found in reaching out into nature with our 
imagination.
The art of science advice to government


Peter Gluckman, New Zealand’s chief science adviser, offers his ten 
principles for building trust, influence, engagement and independence. 


In 2009, I was appointed as the first
science adviser to the Prime Minister of
New Zealand. The week I was appointed
coincided with the government announcement
that the New Zealand food industry
would not be required to add folate to flour-based
products to help to prevent neural-tube
defects in newborns, despite an earlier
agreement to do so. As it happens, this is
an area of my own scientific expertise and,
before my appointment, I had advised the
government that folate supplementation
should occur. But various groups had stirred
considerable public concern on the matter,
about health risks and about medicalizing
the food supply.

Thus, in my first media interview as 
science adviser I was asked how I felt about 
my advice not being heeded. I pointed out 
that despite strong scientific evidence to 
support folate supplementation, a demo- 
cratic government could not easily ignore 
overwhelming public concern about the 
food supply. The failure here was not politi- 
cal; rather, it was the lack of sustained and 
effective public engagement by the medical- 
science community on the role of folate in 
the diet. As a result, the intervention did not 
get the social licence necessary to proceed. 

Five years on, I am still in the post. I 
have come to understand that the primary 
functions and greatest challenges for a 
science adviser are providing advice not
on straightforward scientific matters, but
instead on issues that have the hallmarks of
what has been called post-normal science¹.
These issues are urgent and of high public 
and political concern; the people involved 
hold strong positions based on their values, 
and the science is complex, incomplete and 
uncertain. Diverse meanings and under- 
standings of risks and trade-offs dominate. 
Examples include the eradication of 
exogenous pests in New Zealand’s unique 
ecosystems, offshore oil prospecting, legali- 
zation of recreational psychotropic drugs, 
water quality, family violence, obesity, teenage
morbidity and suicide, the ageing
population, the prioritization of early-childhood
education, reduction of agricultural
greenhouse gases, and balancing
economic growth and environmental
sustainability.

My own experience is of a Westminster- 
style parliamentary democracy in a small 
advanced economy. In this context, I have distilled
a set of ten principles to guide my work.
Other countries have different forms of government
and different cultural histories of
public reason; high-level scientific advice
may be provided by individuals, councils
or academies, or a combination. Nevertheless,
I think my guidelines are relevant to all those providing
advice to senior levels of government.

“Policy-makers and elected officials rightly guard their
responsibility to define policy.”

These principles differ a little from those 
that might guide individual researchers 
and academics in attempting to influence 
policy in areas of their own interest and 
expertise²,³. Crucially, science advisers
are obliged to advise in the context of the 
policy process. This means elucidating the 
evidence-informed options, rather than 
simply advocating a course of action. 


TOP TEN 

Maintain the trust of many. The science 
adviser must sustain in parallel the trust 
of the public, the media, policy-makers, 
politicians and the science community. 
This is especially true in times of crisis and 
is no small challenge. Food-safety pan- 
ics such as foot-and-mouth disease and 
Creutzfeldt-Jakob disease (CJD) catalysed 
a strengthening of the science-advisory sys- 
tem in the United Kingdom, enhancing the 
roles of departmental science advisers. The 
aftermath of the 2011 nuclear meltdown in 
Fukushima is causing Japanese officials to 
take a critical look at advisory practices⁴.

In my case, it was an earthquake that 
tested trust. In early 2011, Christchurch in 
New Zealand experienced the second of two 
major earthquakes in six months. The events 
had devastating consequences, including 
nearly 200 deaths, and effectively destroyed 
our second-largest city. This cluster of quakes 
was unusual and led to seismologists pub- 
licly competing in their interpretation of the 
nature of the fault-lines and future risks. This 
confused the public and policy-makers. 

It took considerable dialogue with the 
scientists for them to understand the need 
for simple and consistent communication, 
and to accept that erudite and, in many 
ways, self-serving scholarly discourse did 
not belong on the front page of newspapers 
every day. What was needed was clear com-
munication of the knowns and unknowns.

Worse, because the earthquake happened
on the day of a full Moon, a popular 
astrologer got primetime coverage when 
he predicted another big earthquake would 
occur one month later when the Moon and 
Sun would next be in alignment. Panic set 
in. We worked with New Zealand's Science 
Media Centre to calm the public while 
acknowledging seismic uncertainty. 


Protect the independence of advice. The 
advisory role should be structured so as to 
protect its independence from both politi- 
cal interference and premature filtering in 
the policy process. There is inevitably a ten- 
sion between such independent advice and 
departmental policy processes, and it takes 
considerable diplomacy to create a trusted 
partnership between an external adviser 
and departmental officials. The terms of 
my appointment protect my independence 
in that I continue to be an employee of my 
academic institution, seconded to the prime 
minister, and my advisership is not tied to 
the electoral cycle. That said, an adviser must 
recognize that publicly disturbing the demo- 
cratic process could mean losing the trust 
of the elected leader and thus any potential 
for influence. So there are issues on which a 
national scientific academy or a panel may be 
best placed to advise or to be seen supporting 
the individual adviser. 


Report to the top. Scientific advice must 
be available directly — uncensored — to 
the head of government or the head of the 
relevant department. Indeed, the questions 
for which advice is most often sought tend 
to be politically sensitive and cut across indi- 
vidual portfolios. 


In New Zealand, for instance, the economic 
importance of land-based primary industries 
must be balanced against maintaining our 
ecosystems and the eco-tourism indus- 
try built on them. These concerns are the 
responsibility of separate ministries, whose 
respective takes on environmental impact 
are inevitably framed by their mandates. The 
adviser’s perspective transcends these. 


Distinguish science for policy from policy 
for science. Science advising is distinct 
from the role of administering the system of 
public funding for science. There is poten- 
tial for perceived conflict of interest and 
consequent loss of influence if the science 
adviser has both roles. There is a risk that 
the adviser comes to be perceived as a lob- 
byist for resources, or that the role becomes 
restricted to the ministry that manages the 
national research funding. Yes, a science 
adviser should have input into science pol- 
icy, but there is a delicate balance to strike. 

Early in my appointment, an unnecessary 
tension was created by media portrayals that 
I, rather than the relevant ministry, was the 
key influence on science policy. This is not so 
— nor did I want it to be — but communica- 
tion with that ministry became strained for 
some time, denting my effectiveness. 


Expect to inform policy, not make it. 
Science advice is about presenting a rigor- 
ous analysis of what we do and do not know. 
Alone, it does not make policy. There are 
many other appropriate inputs to policy, 
including fiscal considerations and public 
opinion. Policy-makers and elected officials 
rightly guard their responsibility to define 
policy — and this means choosing between
options with different trade-offs. This is
not the domain of a science adviser. Being
explicit about this has eased my capacity
to establish and sustain trust broadly across
government and the policy community.

Post-normal science: Christchurch, New Zealand, after its second earthquake in six months.
A gas-production station near Mount Taranaki in New Zealand’s North Island.


Give science privilege as an input into 
policy. While acknowledging the other 
relevant inputs into policy formation, we 
need to demonstrate why science should 
hold a privileged place among the ‘types 
of knowledge’ that may be meaningful to 
a politician. These include social tradi- 
tion and popular belief. The privilege of 
science-derived knowledge comes from 
its set of standard procedures — for exam- 
ple, replication and peer review — that 
limit the influence of beliefs and dogma. 
The other inputs into policy are value- 
intensive, and rightly so. 


Recognize the limits of science. Science 
can increasingly address complex questions 
over which policy-makers and elected offi- 
cials agonize. But scientists must not over- 
state what is or can be known, even though 
the shift from a view of science as a source 
of certainty to a source of probability can 
frustrate and confuse decision-makers and 
the public. How many politicians or issues 
advocates have claimed that they can find 
a scientist to back any position as, indeed, 
at least one did in the folate debate? This 
attitude reflects the dangerous temptation 
to use science to justify value-based beliefs
and a lack of literacy about what science is (a
process). For example, much of the debate
about climate change is not primarily about 
the data. Rather, it is about intergenerational 
economic interests. 


Act as a broker not an advocate. Trust can 
be earned and maintained only if the science 
adviser or advisory committee acts as a 
knowledge broker, rather than as an advo-
cate — often a subtle distinction. When
formal science advice is perceived as advo- 
cacy, trust in that advice and in the adviser is 
undermined, even if the advice is accepted. 
For example, exaggerated presentations about 
the causes of storms and floods can erode the 
credibility of the underlying argument about 
global warming. My own academic research 
has been focused on the developmental ori-
gins of obesity, and I have had to be careful
to ensure a balance of advice. Even so, where
there is strong advocacy for other approaches, 
suspicion about the balance of my advice is 
hard to avoid (see go.nature.com/syxyee). 


Engage the scientific community. The 
science adviser must know how to reach out 
to scientists for the appropriate expertise, and 
help them to enact their social responsibility 
in making their knowledge accessible and 
understandable, and in being more self-aware 
about when they might be acting as advocates. 
These issues are encapsulated in the recently
updated, groundbreaking Code of Conduct
for Scientists, published by the Science
Council of Japan, which directly implies a
distinction between brokerage and advocacy.


Engage the policy community. The role of
the science adviser is often less about provid-
ing direct technical expertise than it is about
nudging attitudes and practices to enhance 
both the demand for and the supply of evi- 
dence for public policy. 

Why? Because sceptics in the policy com-
munity are surprisingly prevalent. In 2012, I
surveyed how our public-service personnel
use evidence in making policy. Several
ministries stated that their job was to design 
policy that met the minister’s requirements, 
not to advise on policy options on the basis 
of available evidence. Studies in Canada and
Australia found similar results.


GOOD ADVICE 

These principles that guide my own work 
probably apply to most models of science 
advice. The use of advisers, advisory councils 
or academies need not be mutually exclusive. 
Different approaches suit different purposes 
and are the product of a country’s culture, 
history, political and social structures and 
approaches to civic reason.

In my experience, achieving the culture 
change that encourages the better use of 
scientifically derived evidence in govern-
ment relies on a level of trust that may be best
achieved by one-to-one relationships with 
senior members of the executive govern- 
ment. In crises, such relationships are essen- 
tial. By contrast, for complex and chronic 
issues, I believe that advisory committees or 
academies have a crucial part to play. 

Happily, these matters are increasingly 
being discussed. In August this year, the first 
global conference of academics and practi- 
tioners of science advice to governments will 
take place in Auckland, New Zealand (see 
www.globalscienceadvice.org). I hope that 
two days of discussion between thought lead- 
ers from around the world about principles, 
methods, tensions and solutions from myriad 
contexts will make an important contribution 
to this rapidly growing field.


Royal Society co-founder John Evelyn, painted in around 1650 by Dutch artist Adriaen Hanneman.


IN RETROSPECT 


Sylva 


Gabriel Hemery celebrates the 350th anniversary of John 
Evelyn’s treatise on the science and practice of forestry. 


Three hundred and fifty years ago,
London’s recently formed Royal
Society — the body at the heart of the
Enlightenment — published its first book.
It was written not by Robert Boyle, Isaac
Newton or any of the other luminaries of
seventeenth-century experimental philoso-
phy, but by another founding member of the
society: the prodigious public servant John
Evelyn (1620–1706). And its subject was not
anatomy, astronomy, chemistry or optics, but
forestry. Sylva is a practical treatise on silvi-
culture and an enduring classic, published
in four editions during Evelyn’s lifetime and
posthumously in a further six, up until 1825.
It remained the dominant forestry trea-
tise in English until the nineteenth century
and, thanks to its rich language, remains a 
favourite among tree experts. It inspired my 
own forthcoming book with Sarah Simblet, 
The New Sylva (Bloomsbury, 2014;
www.newsylva.com).

Sylva was a response to an early venture of 
the Royal Society. Various committees had 
been formed to help to organize experiments 
and produce reports, and one of the first set 
out to respond to the Navy Royal's concerns 
about timber shortages arising from the 
degraded state of the nation’s forests. 

Evelyn took the lead in this venture,
presenting a paper to the society in
October 1662 that was imprinted as a
book some 18 months later. Evelyn wrote in his
diary — alongside that of his contemporary
and friend Samuel Pepys, a record of signifi-
cant historical importance — on 16 February
1664 that “I presented my ‘Sylva’ to the Soci-
ety; and next day to his Majestie [Charles II],
to whom it was dedicated; also to the Lord
Treasurer and the Lord Chancellor”.

Sylva; Or a Discourse of Forest-Trees, and the Propagation of Timber in His Majesties Dominions
JOHN EVELYN
Royal Society: 1664.

Sylva encouraged the nation’s landowners 
to plant more trees and care for their forests, 
in the interests of the strategic defence of a 
nation reliant on ‘wooden walls’ — that is, 
the navy. It inspired considerable interest in 
tree planting in Britain, both as new forests 
on private estates, and on city streets and in 
formal gardens. Evelyn wrote on everything 
from London smog (Fumifugium, 1661) and 
salad (Acetaria, 1699) to soils (A Philosophi- 
cal Discourse of Earth, 1676), and served 
as Commissioner for the Privy Seal and as 
Treasurer to the Royal Hospital for Seamen 
at Greenwich. But Sylva and his diary com- 
prise his greatest legacy. 


CULTIVATING VARIETY 
As the wellspring of the Enlightenment, the 
seventeenth century witnessed considerable 
botanical discovery and geographical explo- 
ration in the New World and the Far East. 
German naturalist Engelbert Kaempfer, for 
example, was the first European to describe 
the maidenhair tree (Ginkgo biloba), which 
he saw in Japan. Evelyn advocated introduc- 
ing new tree species to Britain, where diversity 
was limited to 60 native species. As he wrote, 
it was important “to promote the culture of 
such plants and trees (especially timber) as 
may yet add to those we find already agree- 
able to our climate in England” (this and other 
quotes taken from the 1776 edition of Sylva). 
Evelyn was born into a family whose 
wealth was founded on gunpowder. He 
attended Balliol College at the University of 
Oxford but never graduated, prevented by 
his father’s ailing health and the rumblings 
of the English Civil War. During the Inter- 
regnum, Evelyn travelled widely in Europe, 
returning to England in 1652 considerably 
better educated in areas such as anatomy. 
Notably, he now had a strong interest in 
horticulture after witnessing continental 
garden design, and had collected specimens 
and seeds of exotic plants. At Sayes Court 
near London, he transformed the garden, 
introducing a European formality merged 
with traditional English informality. In the 
same decade, Evelyn began to write his vast 
gardening treatise Elysium Britannicum, 
which he worked on for much of his life but 
never completed. It was largely down to his
prowess as a garden designer and plantsman
that he was asked to lead the enquiry on the 
state of the nation’s forests and their care. 

Evelyn’s personal motto was omnia 
explorate; meliora retinete (explore every- 
thing; keep the best), and in Sylva he adhered 
closely to this ambition. He described in detail 
the tree species of “greatest use, and the fit- 
test to be cultivated”, dwelling mostly on oak. 
As many as 2,000 oak trees were required for 
each navy ship. Of its wood, he notes that 
“though some trees be harder, as Box, Cornus, 
Ebony, and divers of the Indian woods; yet 
we find them more fragil, and not so well 
qualified to support great incumbencies and 
weights, nor is there any timber more lasting”. 
After oak, he gave ash, elm and pine greatest 
prominence, given their utility in shipbuild- 
ing, construction and everyday life. 

He discusses the natural environment — 
air, soil and water — and tree-nursery and 
forest management, tree diseases, and the 
cultural significance of trees and forests. 
He details how to collect seeds, raise young 
plants, prune (often improving the healing 
of a cut with cow dung) and optimize tim- 
ber use. He relies heavily on the wisdom of 
ancient philosophers, such as Pliny the Elder, 
melded with the contemporary and practical 
silviculture practised by the landed gentry. 
He also includes many medicinal remedies 


— for example, ash for toothache, or box for 
venereal diseases — although he admits that 
“quacking is not my trade; I speak only here as 
a plain husband-man, and a simple forester”. 

Evelyn inspired landowners to plant more 
forest trees, yet such is the lag between vision 
and fruition in forestry that the oak and other 
productive forest species intended for ship- 
building were eventually to support other 
industries, especially as pit props
for coal mining. He also sought to
ensure the protection of Britain’s
forests, but it was not until the mid-
eighteenth century that an Act of Parliament
offered them formal protection. Despite both
afforestation and conservation, Britain’s for-
ested area continued to dwindle. It reached an
all-time low of 5% at the start of the twentieth
century, and in response the Forestry Com-
mission was formed in 1919 to coordinate an
afforestation programme aiming to create a
strategic reserve of timber for the nation.

“Sylva was a response to the Navy Royal’s concerns about timber shortages.”

Like many foresters, Evelyn had foresight
and ambition that echoed well beyond his
own lifetime. Society has finally come to
appreciate the functions of forest soils in the
carbon cycle, the role of the world’s forests in
combating climate change, the importance
of the world’s forests and their associated 
biodiversity, and the role that trees have in 
maintaining human wellbeing. We are just 
beginning to realize the true potential of 
renewable materials made from woody bio-
mass. Skyscrapers of mass-timber construction,
up to 30 storeys high, are being con-
sidered. Nanocrystalline cellulose made from
wood pulp — a material stronger than steel 
— is being used to replace synthetic materials, 
such as the plastics in car manufacture and 
conventional ballistic material in bullet-proof 
vests. Evelyn planted the concept of a wood 
culture, but it is maturing only in the early 
twenty-first century. 

Balancing our demand for nature’s 
wonder-material with the need to protect 
Earth from our industry, to grow food for 
our ever-increasing population, and to 
address the problems posed by pests and 
pathogens spread by global trade, presents 
an enormous challenge. The delightful prose 
and practical advice in Sylva continue to 
inspire 350 years on.


Gabriel Hemery is a silvologist, author and 
photographer. He is chief executive of the 
Sylva Foundation. His first book, The New 
Sylva, will be published in April. 

e-mail: gabriel@sylva.org.uk 


13 MARCH 2014 | VOL 507 | NATURE | 167 
© 2014 Macmillan Publishers Limited. All rights reserved 


PARTICLE PHYSICS 


Higgs on the big screen 


Alexandra Witze savours a behind-the-scenes look at physics’s most famous arrival. 


How could there possibly be anything
fresh to say about the Higgs boson,
the subatomic particle whose 2012
discovery sparked a Nobel prize, a slew of
popular books, an exhibition and even a
zombie movie and a rap song? Remarkably,
physicist-turned-filmmaker Mark Levinson
pulls it off in a documentary about the Higgs
shot where the particle was discovered.

It is hard to get distinctive footage when 
camera crews have been crawling over the 
Large Hadron Collider (LHC) at CERN, 
Europe’s particle-physics laboratory near 
Geneva, Switzerland, for years. Levinson’s 
edge is that he filmed on and off between 
2008, when the LHC launched, and 2012. The 
Particle Fever team shot almost everything 
that counted: from the champagne-popping 
celebrations at the first beam of circulating 
protons, to the dirty and disfigured wrecks 
of the superconducting magnets that blew a 
week later, crippling the machine for months. 

The film's other fresh take is its choice of 
protagonists. You won’t hear much about the
superstar researchers. Rather, the characters 
are everyday experimentalists and theoreti- 
cians caught up in the race to discovery. One 
of the most endearing is Monica Dunford, a 
talkative US postdoc who is forever throwing 
on a hard hat and taking a wrench to bits of 
the LHC that are not working. Hearing that 
one of the machine’s detectors is five sto- 
reys tall is one thing — watching Dunford’s 
face light up as she takes it all in is another.
Another key figure is CERN’s Fabiola
Gianotti, the former would-be philosopher 
who became spokeswoman for one of the two 
major Higgs-hunting experiments in 2009. 

A parallel storyline follows a handful of 
theoretical physicists waiting for the results. 
The most eloquent is David Kaplan, the 
physicist who produced the film. It is Kaplan 
who handles the explainers on the Higgs 
during an academic lecture that efficiently 
dispenses scientific background. And it is 
Kaplan who films himself driving in the 
middle of the night to a party in Prince- 
ton, New Jersey, to watch the unveiling of 
the particle, missing a highway exit in his 
excitement. But Nima Arkani-Hamed steals 
the show on several levels. Slightly distracted 
and ever garbed in a red-and-black striped 
rugby shirt, he talks about his family escap- 
ing revolutionary Iran, and the solace that 
physics provided. Pacing late at night at the 
Institute for Advanced Study in Princeton, 
Arkani-Hamed is a kind of physics every- 
man, standing in for all the scientists strain- 
ing to catch the news from the LHC. 

Particle Fever
DIRECTED BY MARK LEVINSON
On limited cinema release from 5 March 2014.

The film’s only stumble is a segment involv-
ing controversial ideas about whether many
universes might co-exist — the multiverse.
As data start to flow back from the LHC, tan-
talizing hints of the Higgs emerge, but
the particle’s mass is initially unclear.
The film pits two
possible Higgs masses against one another,
each representing one possible explanation 
— either that the multiverse exists, or that 
every particle in the Universe has a shadowy 
‘supersymmetric’ partner. This strategy pro-
vides tension but overemphasizes the possi-
bility that a particular Higgs mass supports
the multiverse idea. It might have been better 
to play up the competition between the two 
main LHC instrument teams. This drama is 
not apparent in Particle Fever, even in the cli- 
mactic scene in which Gianotti and her coun- 
terpart on the second experiment unveil the 
Higgs discovery at an unforgettable seminar. 

That is a quibble, however. The sense of 
scientific drive is palpable throughout the 
film, and even the coverage of the discovery 
seminar is fresh. Rather than chase the CERN 
director around the auditorium where the 
announcement was made, Levinson shows 
us Savas Dimopoulos, a theorist at Stanford
University in California, trying to talk his way 
past the guards at the auditorium door to be 
present for the historic moment. We see Dun- 
ford huddled in front of a computer screen, 
eyes fixed on the video feed as Gianotti 
announces that the Higgs has been found. 

And we see a baby erupt into tears in the 
hallway outside the CERN auditorium — 
like the Higgs itself, a noisy newborn and a 
bringer of joy.


Alexandra Witze is a correspondent for 
Nature in Boulder, Colorado. 


CLAUDIA MARCELLONI/CERN


China must publicize
its emissions reports 


As China is the world’s largest 
energy consumer and carbon 
dioxide emitter, the future 
trajectory of its carbon emissions 
will play a crucial part in global 
mitigation plans. However, 
China’s National Assessment 
Reports on Climate Change, 
issued in 2007 and 2011, are 
limited in scope and not widely 
disseminated or cited. These 
shortcomings must be rectified 
before the next report is released 
this year. 

Compared with the assessment 
reports by the Intergovernmental 
Panel on Climate Change (IPCC), 
China’s have had scarcely any 
impact. The IPCC report is cited 
thousands of times in United 
Nations official documents, 
whereas China’s seem to be barely 
mentioned even in Chinese- 
government documents. 

In our view, this low impact 
reflects the general lack of public 
interest in climate change in 
China, the paucity of media 
coverage and scholarly study, 
and the insufficient efforts by 
central and local government to 
reduce energy consumption and 
greenhouse-gas emissions. 

China's next assessment 
report needs to be more widely 
promoted, include more high- 
quality research results, and 
objectively evaluate current 
policies to tackle climate change 
in the country. China should 
learn from the IPCC and 
open the way to international 
collaboration in preparing 
and promoting the country’s 
assessment reports. 

Yuan-Feng Wang, Yu-Rong 
Zhang Beijing Jiaotong 
University, China. 
cyfwang@bjtu.edu.cn 


Biomedicine must 
look beyond P values 


Establishing statistical validity 
for study findings goes beyond a 
consideration of P values alone 
(R. Nuzzo Nature 506, 150–152;
2014). In the era of big data,
we now have many biological 
measures available for assessing 
how likely findings are to be true 
positives. 

This more-comprehensive 
approach has long been used 
by epidemiologists to address 
concerns about bias and causality: 
for example, in investigations 
of possible components of 
hypothetical disease-causing 
pathways (L. H. Kuller et al. Am. 
J. Epidemiol. 178, 1350-1354; 
2013). A way of inferring a causal 
association is to apply Hill’s 
criteria, which seek ties between 
many factors, such as dose 
response, temporality and disease 
exposure (A. B. Hill Proc. R. Soc. 
Med. 58, 295-300; 1965). 

Advances in genomics and 
systems biology enhance our 
capacity for such investigations. 
We can now determine 
whether findings operate in a
specific genotype context or fit 
biologically plausible pathways or 
networks — as was done in 
a re-evaluation of results from a 
genome-wide association 
study for multiple sclerosis 
(International Multiple Sclerosis 
Genetics Consortium Am. J. 
Hum. Genet. 92, 854-865; 2013). 
Anne-Louise Ponsonby Murdoch 
Childrens Research Institute, 
Parkville, Victoria, Australia. 
anne-louise.ponsonby@mcri. 
edu.au 
Terence Dwyer International 
Agency for Research on Cancer, 
Lyon, France. 


Road maps of no use 
to some physicists 


We do not believe that it is feasible 
for a single organization to draw 
up a ‘road map’ for future light- 
and neutron-source facilities 
in the way that CERN does for 
the particle-physics community 
(see P. G. Radaelli Nature 505, 
607-609; 2014). The diversity of 
the communities that use these 
facilities makes the centralization 
of scientific priorities impossible. 
A range of research fields 
will benefit from Europe’s
investments in the X-ray Free
Electron Laser and the European 
Spallation Source (ESS), and 
hundreds of scientists are 
collaborating to define ESS’s 
capabilities and instrumentation 
(see go.nature.com/ip6afc). 

Cassandras abound at the start 
of any large project that pushes 
the technological envelope, but 
the ESS is expected to boost 
scientific performance for 
neutron studies by as much as 
300-fold, opening up entirely 
new fields of science. 

Because the ESS forms 
part of a wider network of 
neutron and light sources, the 
discussion needed in Europe is 
how to leverage and integrate 
these sources most effectively 
to improve our research and 
economic environment. 
Aleksandar Matic Chalmers 
University of Technology, 
Gothenburg, Sweden. 
Peter Böni Technische
Universität München, Germany.
Adrian Goldman University of 
Leeds, UK. 
a.goldman@leeds.ac.uk 


End education 
meddling in Nepal 


The Nepalese government's 
backtracking last month on its 
political patronage of academia 
(Nature 506, 279; 2014) raises 
hopes that appointments to
top university positions will
soon be made by non-political 
committees of scholars. Such a 
move would ensure the selection 
of senior academics on merit, 
rather than political affiliation; 
these individuals would help to 
attract much-needed funding for 
education and research. 

The Nepalese government 
needs to create an educational 
system that is free of political 
meddling and nepotism. The 
country’s young scientists will 
return from abroad only when 
a proper infrastructure is in 
place that will enable them 
to implement their skills and 
to realize their ambitions in a 
merit-based society. 


Kosh P. Neupane Tufts 
University, Medford, 
Massachusetts, USA. 
koshalnp@hotmail.com 


Fight floods on a 
global scale 


Accurate digital elevation models 
(DEMs) created using airborne 
lidar have transformed regional 
flood modelling and forecasting. 
At continental and global scales, 
however, the best-available DEMs 
come from satellite images and 
are too crude for simulating 
flooding — and its related risks 
to public health, biogeochemical 
cycling and wetland ecology. 

We would like to see industry, 
governments and humanitarian 
agencies come together to 
support the development of
a global DEM with higher
resolution and accuracy. 

Current global DEMs cannot 
resolve the detail of terrain 
features that control flooding. 
More-effective flood-hazard 
maps could be created by 
obtaining high-resolution stereo 
images from satellites, combined 
with the latest advances in flood 
modelling using supercomputers. 
By 2050, worldwide annual losses 
due to flooding are predicted to 
reach US$1 trillion (S. Hallegatte 
et al. Nature Clim. Change 3, 
802-806; 2013). A global-scale 
DEM would have an enormous 
impact on finance (such as flood 
re-insurance), humanitarian 
services (such as disaster relief) 
and scientific research. 

The advanced global DEM 
would use existing lidar data and 
stereo satellite images, and new 
lidar elevation data would be 
acquired on board disaster-relief 
aircraft or on drones deployed 
over flood plains. The operation 
costs would therefore be 
substantially cheaper than most 
satellite missions. 

Guy J.-P. Schumann* NASA Jet
Propulsion Laboratory, Pasadena, 
California, USA. 
guy.j.schumann@jpl.nasa.gov 
*On behalf of 4 co-signatories; see 
go.nature.com/j1pchz for full list.


Air pollution and forest water use


ARISING FROM T. F. Keenan et al. Nature 499, 324-327 (2013) 


Forests in North America and northern Europe increased their water-
use efficiency (WUE)—the ratio of photosynthetic CO₂ uptake to water
loss through evapotranspiration—over the last two decades, according
to a recent Letter¹. Keenan et al. attribute the rising WUE to fertilization
by increasing levels of atmospheric CO₂ (ref. 1), although biosphere
models predict this effect to be much smaller than the observed trend.
Here, I show that falling concentrations of ozone and other phytotoxic
air pollutants, which were not considered in ref. 1, may explain part of
the WUE trend. Future efforts to reconcile biosphere models with field
data should, therefore, use integrated modelling approaches that include
both air quality and CO₂ effects on forest growth and water use. There
is a Reply to this Brief Communication Arising by Keenan, T. F. et al. 
Nature 507, http://dx.doi.org/10.1038/nature13114 (2014). 

Tree injuries caused by ozone, the most phytotoxic air pollutant—
including visible foliar injury, reduced photosynthesis and diminished
biomass—depress global ecosystem productivity and are well docu-
mented in field observations from North America and Europe. Ozone
enters leaves through stomata and causes internal oxidative stress and
membrane damage that reduce photosynthetic CO₂ assimilation.
During ozone injury, transpiration usually falls less than does pho-
tosynthesis, but transpiration can sometimes rise because of ozone
injury to stomata. In either case WUE declines.

Surface ozone concentrations during the summer growing season 
have fallen significantly in eastern North America and modestly in 
northern Europe owing to emission controls on vehicles and indus-
trial sources of ozone precursors. Figure 1 shows ozone trends in
regions around the rural forest sites analysed in ref. 1, evaluated as
summer daytime-mean mole fraction (Fig. 1a) and as the accumulated
concentration over a threshold of 40 nmol mol⁻¹ (AOT40, defined as
in the literature), which is a common predictor for plant injury
(Fig. 1b). I calculated both ozone metrics using only rural sites—from 
the US Clean Air Status and Trends Network (CASTNET; http://epa.gov/ 
castnet) and the European Monitoring and Evaluation Programme 
(EMEP; http://www.nilu.no/projects/ccc/emepdata.html)—reporting 
at least 14 years of hourly ozone data during the period 1995-2010 
(Fig. 1c and d). By either metric, ozone significantly decreased at all 
sites in the midwestern USA (n = 11, P < 0.001–0.02 for Kendall’s τ 
test) and northeastern USA (n = 5, P = 0.001-0.004). For averages 
over all sites within each region, AOT40 fell by half in the period 
1995-2010 in both regions (P< 0.002). Over northern Europe most 
sites had negative trends, but with smaller magnitudes, consistent with 
other recent analyses’®. 
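
The two exposure metrics described above can be sketched in a few lines. This is a minimal illustration, not the author's processing code: the function name, the input layout (pairs of local-time timestamp and hourly ozone in parts per billion) and the reading of "8:00-20:00" as the twelve hours starting at 8:00 are all assumptions made for the example.

```python
from datetime import datetime

THRESHOLD_PPB = 40.0  # AOT40 threshold: 40 nmol/mol = 40 p.p.b.


def daytime_mean_and_aot40(hourly):
    """Return (daytime-mean ozone in p.p.b., AOT40 in p.p.m. hours).

    `hourly` is an iterable of (datetime, ozone_ppb) pairs in local time
    (hypothetical input layout). Only April-September observations in the
    hours starting 8:00 through 19:00 are used (assumed reading of the
    8:00-20:00 window).
    """
    values = [
        ppb
        for when, ppb in hourly
        if 4 <= when.month <= 9 and 8 <= when.hour < 20
    ]
    if not values:
        raise ValueError("no growing-season daytime observations")
    mean_ppb = sum(values) / len(values)
    # Each hourly value represents one hour of exposure above the threshold.
    aot40_ppb_hours = sum(max(0.0, v - THRESHOLD_PPB) for v in values)
    return mean_ppb, aot40_ppb_hours / 1000.0  # p.p.b. hours -> p.p.m. hours


# Synthetic example: one July day plus one January day that must be ignored.
july = [(datetime(2000, 7, 1, h), 30.0 + h) for h in range(24)]
january = [(datetime(2000, 1, 1, h), 90.0) for h in range(24)]
mean_ppb, aot40 = daytime_mean_and_aot40(july + january)
```

Because AOT40 only accumulates the excess over 40 p.p.b., it responds mainly to high-ozone hours, whereas the daytime mean weights all hours equally.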

The first-order effect of these ozone trends in the Midwest, using 
sensitivities for broad-leaf trees®!*"*, would be a 0.6% annual increase 
in biomass accumulation and a 0.3% annual improvement in WUE. In 
addition, partial closure of stomata in response to rising CO₂ (ref. 14) 
and rising vapour pressure deficit’ reduces leaf uptake of ozone by 
approximately 0.9% per year regardless of ozone trends. Combining 
all these effects, improvements in ozone air quality over the period 
1995-2010 probably increased forest WUE by approximately 0.33% 
per year in the midwestern USA and slightly less in the northeastern 
USA. Using the range of ozone sensitivities reported for tree species'*'*"*, 
the ozone effect on WUE in the midwestern USA could be 0.1-0.8% 
per year. This predicted ozone effect is about one-sixth of the observed 
WUE trend (2% per year, calculated from the Supplementary Information 
to ref. 1) and larger than the mean simulated effect of CO₂ fertilization 
in the terrestrial biosphere models surveyed by ref. 1. Measuring ozone 
mole fractions and fluxes into the forest canopy simultaneously with 
Figure 1 | Trends in ozone exposure metrics that correlate with tree injury. 
a, Daytime-mean ozone mole fraction; b, AOT40. Both metrics are calculated 
in April-September of each year during the hours 8:00-20:00 (local time) at 
rural sites in the USA (c) and Europe (d) near forest stations that monitor 
WUE. Lines show the mean trends (Sen’s method) averaged across all stations 
within each region (±1 standard error, P values from Student’s t-test). The 
unusually high mole fraction and AOT40 values in Europe in 2003 and 2006 
were caused by extreme heatwaves. 


WUE should constrain the effect, but the variability of WUE trends 
across sites and years illustrates that ozone data from multiple forests 
and many years are necessary to obtain robust results. In addition to the 
decline in ozone concentration, the concentrations of the air pollutants 
NO₂ and SO₂, which also harm WUE both individually and through 
synergistic effects with ozone, have fallen quickly but the effects are not 
included here''. Thus, the benefits of improved air quality to forest 
productivity and WUE may be larger than I have estimated. Keenan et al.’ 
suggest that current terrestrial biosphere models underestimate the 
impact of CO₂ fertilization on WUE. The calculations here show that 
ozone trends help to reconcile the large differences between models 
and observations. 


Methods 


I calculated photosynthesis reductions from ozone AOT40 trends (−0.8 parts per 
million (p.p.m.) hours per year for the midwestern USA, where 1 p.p.m. = 1 µmol mol⁻¹) 
using empirical correlations with ozone exposure for young broad-leaf 
trees (−0.7% per p.p.m. hour, for beech, birch and maple)'*’’. Other tree species 
may be more (poplar) or less (conifers, oak) sensitive: −1.8% to −0.2% per p.p.m. 
hour (refs 12 and 15). Ozone-induced WUE changes are half those of 
photosynthesis and of the same sign’. Rising CO₂ (2 p.p.m. per year) and rising vapour 
pressure deficit (11 Pa per year; ref. 1) reduced stomatal conductance and ozone 
uptake by approximately 0.4% per year and 0.5% per year, respectively, based on 
empirical sensitivity factors'* (conductance changes are −0.2% per p.p.m. of CO₂ 
and −0.05% per Pa). 
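
The first-order arithmetic in these Methods can be checked with a short sketch. All numbers are taken from the text above; the constant and function names are hypothetical. The sketch does not attempt to reproduce how the separate terms are combined into the 0.33% per year estimate, because the text does not spell out that step.

```python
# Values quoted in the Methods (midwestern USA).
AOT40_TREND = -0.8           # p.p.m. hours per year
PHOTO_SENSITIVITY = -0.7     # % photosynthesis per p.p.m. hour (beech, birch, maple)
SENSITIVITY_RANGE = (-1.8, -0.2)  # % per p.p.m. hour (poplar ... conifers, oak)
WUE_FRACTION = 0.5           # WUE change is half the photosynthesis change
CO2_TREND = 2.0              # p.p.m. per year
VPD_TREND = 11.0             # Pa per year
COND_PER_PPM_CO2 = -0.2      # % conductance per p.p.m. of CO2
COND_PER_PA = -0.05          # % conductance per Pa


def ozone_first_order(sensitivity=PHOTO_SENSITIVITY):
    """Return (% per year change in photosynthesis, % per year change in WUE)."""
    photosynthesis = AOT40_TREND * sensitivity
    return photosynthesis, WUE_FRACTION * photosynthesis


def conductance_trends():
    """Return % per year conductance change from rising CO2 and rising VPD."""
    return CO2_TREND * COND_PER_PPM_CO2, VPD_TREND * COND_PER_PA


photo, wue = ozone_first_order()             # about 0.56 and 0.28 % per year
wue_range = sorted(ozone_first_order(s)[1] for s in SENSITIVITY_RANGE)
co2_effect, vpd_effect = conductance_trends()  # about -0.4 and -0.55 % per year
```

Rounded, these match the 0.6% and 0.3% per year quoted in the main text and the roughly 0.9% per year reduction in ozone uptake from the two conductance terms together.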


Christopher D. Holmes¹ 

¹Department of Earth System Science, University of California, Irvine, 
California 92697, USA. 

email: cdholmes@uci.edu 
REPLYING TO C. D. Holmes Nature 507, http://dx.doi.org/10.1038/nature13113 (2014) 


Forests have become more efficient at using water over the past two 
decades’. A series of hypotheses exist to explain this trend, but the only 
credible explanation to date is a response to rising atmospheric CO₂. 
Keenan et al.’ show that the observed trend is physiologically plausible, 
but is much larger than expected from conventional theory and experi- 
mental evidence. This has led to suggestions that processes other than 
increased atmospheric CO₂ may have contributed to the observed trend”. 
One such process that has yet to be examined is the effect of tropo- 
spheric ozone on forest water-use efficiency (WUE). In the accompany- 
ing Comment’, Holmes reports that ozone concentrations have declined 
in the northeastern and midwestern USA by about 50% from 1995 to 
2010. Using empirical relationships, he estimates that this decline could 
explain roughly 15% of the reported increase in WUE over North America, 
and a significantly lower proportion of the trend in Europe. 

As a preliminary test of the ‘ozone hypothesis’, we analyse 20 years 
of ozone concentration measurements at Harvard forest, in Massa- 
chusetts, USA, which were made concurrently with the carbon and 
water fluxes from which we derived WUE. In agreement with results 
presented by Holmes’, extreme ozone concentrations have declined at 
this forest over the past two decades. Although the 50th percentile of 
ozone levels has stayed relatively constant, both the 95th percentile 
and the AOT40 metric show declining trends over the time period 
(P = 0.09 and 0.11, respectively; Kendall’s τ test, Fig. 1a). 
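
Both this Reply and the Comment summarize trends with the Sen-slope estimator and Kendall's τ. The sketch below is a generic pure-Python version of those two statistics, not the authors' analysis code. The function names and the sample numbers are made up for illustration, and no P value is computed.

```python
from itertools import combinations
from statistics import median


def sen_slope(years, values):
    """Sen's slope: the median of the slopes between all pairs of points."""
    slopes = [
        (v2 - v1) / (y2 - y1)
        for (y1, v1), (y2, v2) in combinations(zip(years, values), 2)
        if y2 != y1
    ]
    return median(slopes)


def kendall_tau(years, values):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    concordant = discordant = 0
    for (y1, v1), (y2, v2) in combinations(zip(years, values), 2):
        sign = (y2 - y1) * (v2 - v1)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(years)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical annual AOT40 values (p.p.m. hours), for illustration only.
years = [1995, 1996, 1997, 1998]
aot40 = [10.0, 8.0, 9.0, 5.0]
slope = sen_slope(years, aot40)
tau = kendall_tau(years, aot40)
```

Both statistics depend only on pairwise comparisons, which makes them less sensitive to single extreme years than an ordinary least-squares fit.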

Despite the declining trend, we found no significant (P = 0.46, 
r = −0.19) relationship between annual means of WUE and the occur- 
rence of high ozone concentrations (Fig. 1b). The observations at Har- 
vard forest, therefore, do not support the claim that WUE is being affected 


E2 | NATURE | VOL 507 | 13 MARCH 2014 


by changes in ozone concentrations. That said, we acknowledge that it 
would be difficult to detect and attribute a change responsible for 15% 
of the trend we observe given the large influence of other factors. 

As an additional test of the ozone hypothesis, we consider the trends 
of the component fluxes of WUE, published in ref. 1. Decreasing ozone 
concentrations are primarily expected to increase leaf photosynthesis, 
with stomatal conductance typically increasing to a lesser extent’, although 
conductance responses vary*. We would therefore expect a similar 
response of photosynthesis and conductance in the data of ref. 1. In 
ref. 1, we report increasing photosynthesis at only 50% of the sites, whereas 
stomatal conductance showed large declines consistently across all sites. 

It is worth noting that global ozone concentrations are increasing’. 
There is no significant trend in ozone concentrations in Europe’, where 
half of the sites used in ref. 1 are located. Within the USA, trends vary 
greatly by region. Ozone in the western USA has increased over the 
past two decades owing to increased levels of precursors from Asia”®. 
Globally, ozone concentrations show large declines only in the mid- 
western and eastern USA®”, where 8 of our 21 sites are based. We there- 
fore agree that it is possible, following the calculations of Holmes’, that 
changes in ozone concentrations contributed to a small proportion 
(roughly 15%) of the trend at those sites. 

In conclusion, we can neither reject nor accept the ozone hypothesis 
of Holmes’, although it is clear that the observed changes in WUE are 
not primarily driven by changes in ozone concentrations. His estimate? 
of an ozone-induced increase in WUE of 0.33% per year is probably an 
upper bound for the response at sites in the USA, but an overestima- 
tion of the response at European sites. That said, the changes in ozone 
Figure 1 | Ozone and WUE at Harvard Forest. a, Long-term trends in ozone 
concentrations (p.p.b., parts per billion; p.p.m., parts per million) and WUE at 
Harvard forest, with trends (lines) estimated using the Sen-slope method. 


concentrations reported by Holmes’ are large in some heavily forested 
and productive regions, and may beneficially affect ecosystem function. 
We therefore agree with Holmes’ that more work is needed to assess the 
impact of air quality on ecosystems globally, and the resulting change in 
ecosystem function. 
b, The relationships between variability in critical summer ozone 
concentrations (AOT40) and variability in mean summer WUE values at 
Harvard forest from 1992 to 2010. 
EVOLUTIONARY BIOLOGY 


Sex, lies and butterflies 


Variation in an evolutionarily conserved sexual-differentiation gene, doublesex, has been found to explain how females of 
one species of butterfly mimic the colour patterns of several toxic species to avoid predation. SEE LETTER P.229 
Figure 1 | Female-specific mimicry. Males of the swallowtail butterfly species Papilio polytes exist in one form, but several female forms co-occur in the same 
population. Females with the cyrus form look like males, whereas other female forms mimic the colour patterns of distantly related toxic species, such as the 
polytes form, which resembles Pachliopta aristolochiae. 


DAVID W. LOEHLIN & SEAN B. CARROLL 


Biological mimicry, in which one species 
gains an advantage by closely resembling 
another, unrelated species, is one of the 
most spectacular phenomena in nature. Expert 
impostors such as the cuckoo, milk snake 
and bee orchid have long fascinated natural- 
ists and have played an important part in the 
development of evolutionary theory. Indeed, 
in the years immediately after the publication 
of Charles Darwin's On the Origin of Species, 
Henry Walter Bates’ and Alfred Russel Wallace” 
recognized that butterfly mimicry was an obvi- 
ous adaptation that could be explained only 
by natural selection. For the next 150 years, 
however, the mechanisms that generate these 
striking patterns eluded biologists. 

But no longer — there has been a burst of 
breakthroughs” in this long-standing mys- 
tery. On page 229 of this issue, Kunte et al.” 
reveal that the remarkable ability of females 
of a swallowtail butterfly species to closely 
match the colour patterns of several unrelated 
butterflies is due to variation at a single genetic 
region: the butterfly version of the well-studied 
doublesex regulatory gene. 

In the classic case of ‘Batesian’ mimicry’, the 
warning colours of unpalatable or toxic butter- 
flies are co-opted by non-toxic free-rider 
species. Among some swallowtail butter- 
flies (genus Papilio), Wallace described an 
intriguing twist in which mimicry is limited to 
females’. Further studies showed that several 
discrete female forms, each resembling the 
warning colour pattern of a different toxic 
butterfly, often co-occur in a population 
alongside non-mimetic females and males® 
(Fig. 1). Although it is still not known why 
one species can maintain several differ- 
ent mimetic and non-mimetic patterns, 
the inheritance of this variation is surpris- 
ingly simple’: female colour patterns stay 
intact in genetic crosses within, but not 
between, populations, with each pattern 
assorting as one of two possible variants from 
a single genetic locus®. 

Kunte et al. bring swallowtail Batesian mim- 
icry into the molecular era by showing that the 
differences between female mimetic forms 
in Papilio polytes are tightly associated with 
genetic variation around the doublesex locus. 
This gene is a particularly satisfying explana- 
tion for the evolution of sex-specific colour pat- 
terns, because genes of the Dmrt family (which 
includes doublesex) control aspects of sexual 
differentiation in most animals’. The doublesex 
gene basically acts as a switch. Specifically, in 
the best-studied doublesex gene (that of the 
fruit fly Drosophila melanogaster), different 
RNA transcripts are produced in males and 
females by a process known as alternative splic- 
ing. The male and female transcripts encode 
distinct protein isoforms that are thought to 
activate or repress different sets of genes, lead- 
ing to sex-specific differentiation”. 
How could intricate wing-pattern variation 
derive from this single genetic signal? In 
principle, different female wing-pattern gene 
variants could derive from mutations that 
alter doublesex transcription, splicing or pro- 
tein structure. Kunte and colleagues report 
that the swallowtail doublesex transcripts are 
also alternatively spliced in different sexes, but 
they find no evidence for splicing differences 
between mimetic forms. Rather, they find sev- 
eral mutations in protein-coding sequences, 
and speculate that these could alter the struc- 
ture and activity of the Doublesex protein. 

However, the authors also make the 
intriguing observation that colour stripes in 
the forewings of mimetic females are accom- 
panied by a striped pattern of Doublesex 
expression. This raises the strong possibil- 
ity that changes to this pattern of double- 
sex expression are the cause of the different 
mimetic forms. This inference is grounded 
in insight into the mechanics of the dou- 
blesex gene in other insects. Specifically, 
rather than signalling ‘male’ or ‘female’ in 
every cell, doublesex is elaborately regulated 
and active in only certain cell populations, 
including those that make structures that 
differ between the sexes*”. Indeed, evolution- 
ary changes to regulatory sequences of the 
doublesex locus have reshaped the wings of 
male wasps’’, and shifts in doublesex expres- 
sion have changed the position of sexually 
dimorphic structures in flies’. Therefore, 
broadening of doublesex expression in the 
swallowtail to a different part of the wing 
might be sufficient to expand a pre-existing 
female-specific colour pattern and generate 
a new mimetic form that could then persist 
owing to the selective advantage it conveys. 
The intricate patterns of doublesex expres- 
sion also help to explain how such apparently 
complex morphological variation could map 
to a single genetic locus. The mimicry loci 
in P. polytes and other butterflies have been 
proposed to be ‘supergenes’ — linked clus- 
ters of trait-altering genes — because of the 
complexity of the colour pattern and the rare 
occurrence of individuals with mixed mimetic 
patterns’. Like other developmental regulatory 
genes, doublesex probably has multiple tran- 
scription-regulating elements (enhancers). 
In principle, different elements could control 
doublesex expression in different parts of the 
swallowtail wing, and genetic variation at two 
elements should occasionally separate when 
chromosomes cross over during gamete for- 
mation. It is possible that other supergenes are 
also well-known developmental regulatory 
genes that have accumulated multiple func- 
tional mutations in evolution. 

By accomplishing the arduous task of gene 
mapping in a butterfly, Kunte et al. open the 
door to understanding the mechanics of how 
the insect’s mimetic pattern is generated and 
how each wing variant is maintained in a 
population. Identifying the precise molecular 
mechanism behind this spectacular mimicry 
switch promises to be exciting, whether it is 
due to regulatory mutations, protein altera- 
tions or a combination of the two, especially 
in light of the central role of Dmrt genes 
in sexual dimorphism across the animal 
kingdom. 


Cosmic lens reveals 
spinning black hole 


The power of a cosmic lens to magnify and split the light from a distant, 
mass-accreting giant black hole into four components has allowed researchers 
to measure the black hole’s spin. SEE LETTER P.207 


GUIDO RISALITI 


Quasars are the most powerful, continuously 
emitting sources of radiation in the Universe. 
They reside at 
the centre of a small fraction of galaxies, and 
are powered by supermassive black holes, 
which have masses millions to billions of times 
greater than that of the Sun. Although giant 
black holes are present in most — possibly 
all — galaxies, not all of them are in an active 
state, in which they accrete gas from a sur- 
rounding disk. In fact, most of these objects 
are in a quiescent phase. It is the active type 
of supermassive black hole that drives qua- 
sars. The formation history of supermassive 
black holes is thought to be closely tied to that 
of their host galaxies, but how exactly they 
form and grow remains unclear. In this issue, 
Reis et al.' (page 207) describe how a cosmic 
lens has enabled them to find that a super- 
massive black hole powering a distant quasar 
has grown through coherent, rather than 
random, episodes of mass accretion. 
Astronomers believe that supermassive 
black holes formed in the early Universe from 
small ‘seeds’ with masses of up to 10,000 solar 
masses. These seeds would have then grown 
to reach millions to billions of solar masses 
either through multiple mergers during galaxy 


collisions or through gas accretion from their 
host galaxies; this accretion would have con- 
sisted either of many short, unrelated accre- 
tion episodes or of fewer, longer and ordered 
accretion phases. Different models of galaxy 
evolution predict a different mix of these pro- 
cesses, so reconstructing the formation history 
of giant black holes would provide a way for us 
to understand how galaxies evolved. 

Supermassive black holes are simple systems. 
They are characterized by just two quantities, 
their mass and their angular momentum (spin). 
Whereas the total amount of accretion and any 
mergers that a supermassive black hole under- 
goes are encoded in its mass, how this mass was 
assembled is encoded in its spin’. A few ordered 
accretion events or mergers of large black holes 
produce high spins, and short, random accre- 
tion processes produce low spins. Measuring 
these spins is therefore a major goal of extra- 
galactic astronomy: the spins of supermassive 
black holes hold a key to understanding the 
evolution of their host galaxies. 

But how can we measure the spins? According 
to Einstein's general theory of relativity, a black 
hole’s gravitational field twists space-time 
around it. Such twisting depends on the black 
hole’s spin, so measuring the twisting allows the 
spin to be estimated. The signature of space- 
time distortion is imprinted on the emission of 
radiation from regions close to the black hole’s 
event horizon — the surface beyond which 
no radiation can escape. In quasars, the bulk 
of the huge, observed luminosity is emitted 
by the accretion disk at optical and ultraviolet 
wavelengths. However, this primary emission 
is nearly featureless, so, despite its vicinity to 
the event horizon, it does not provide an easy 
means to detect space-time distortions. The 
best way to perform such a measurement is to 
observe X-rays reflected by the disk. 

The main source of X-ray emission in 
quasars is believed to be a compact cloud of 
hot electrons in the inner part of the black 
hole’s accretion disk. Some of this radiation 
illuminates the accretion disk and is reflected 
towards the observer’s line of sight. This 
reflected emission usually accounts for less 
than 1% of the total energy produced by qua- 
sars, but contains narrow spectral features — 
most notably, an iron spectral line at the object's 


Figure 1 | A quadruple quasar. Reis and colleagues’ 
analysis¹ of a distant quasar, whose light is magnified 
and split into four components (a-d) by the 
gravitational field of a foreground galaxy (central 
object), has enabled them to calculate the spin of the 
supermassive black hole that powers the quasar. 
rest-frame energy of 6.4-7 kiloelectronvolts 
— the shape of which is strongly altered by the 
space-time warping around the black hole*>. 
The shape of these features can be measured 
in high-quality X-ray spectra, providing a 
measurement of the spins of supermassive 
black holes**. 

This type of analysis is at the heart of Reis 
and colleagues’ work. Until now, astronomers 
have struggled to obtain unambiguous spin 
measurements using this method. The X-ray 
spectra of active galactic nuclei are quite com- 
plex, and the reflection component, which 
contains the signatures of space-time distor- 
tions, is relatively weak. Moreover, certain 
absorption features can mimic the distor- 
tions’. As a result, only long observations of 
a few very bright sources in the local Universe 
made with the most powerful X-ray observa- 
tories — NASA’s Chandra, Europe’s XMM- 
Newton, Japan's Suzaku and, more recently, 
NASA's NuSTAR — have provided convinc- 
ing results®*. In their study, Reis et al. break 
new ground by obtaining a spin measurement 
of a quasar at a distance of more than 6 billion 
light years from Earth, from a time when the 
Universe was about half its current age. 

This remarkable result was possible owing to 
the exceptional nature of the observed source 
— a quadruply imaged, gravitationally lensed 
quasar (Fig. 1). The light from the distant quasar 
is both magnified and split into four different 
images by the gravitational field of a foreground 
elliptical galaxy (the lens) that, by chance, is on 
the line of sight of the quasar. For this reason, 
the authors could analyse four ‘copies’ of the 
X-ray spectrum of the quasar, each with an 
intensity significantly magnified by the lens. 
The resulting X-ray spectra have a quality that 
matches the best that has been obtained for 
nearby sources, and allowed a robust measure- 
ment of the black hole’s spin. As it turns out, the 
spin is large (close to the highest possible value 
that theory predicts), suggesting that the black 
hole acquired its mass through coherent phases 
of mass accretion. 

Although X-ray spectra of a quality compa- 
rable to that obtained here cannot be currently 
obtained for standard, non-lensed sources at 
similar distances, Reis and colleagues have 
opened a window on what astronomers could 
observe with the next generation of X-ray tele- 
scopes, such as Europe’s proposed ATHENA 
mission. 
water reservoir 


A tiny sample of a mineral included in a diamond confirms predictions from 
high-pressure laboratory experiments that a water reservoir comparable in size 
to all the oceans combined is hidden deep in Earth’s mantle. SEE LETTER P.221 


HANS KEPPLER 


How well do we know what lies deep 
inside Earth, more than 500 kilometres 
below our feet? Surprisingly 
well, according to a paper by Pearson et al.’ 
on page 221 of this issue. The authors describe 
the first sample of an unusual mineral from 
Earth’s mantle transition zone, which is located 
between depths of 410 and 660 km. Their study 
suggests that the sample is rich in water, sup- 
porting the hypothesis that the transition zone 
is a hydrous region. 

Observations of earthquakes show that the 
velocity of seismic waves changes abruptly 
at some discontinuities that separate the 
upper mantle from the transition zone and 
the lower mantle. Geologist Alfred E. Ring- 
wood pioneered the idea that these discon- 
tinuities are due to phase changes of the 
mineral olivine — (Mg,Fe)₂SiO₄ — which 
makes up most of the upper mantle. Some 
of these changes result in high-pressure 
phases, including one with a crystal structure 
known as spinel structure’. When spinel- 
structure olivine was subsequently found in 
meteorites that had experienced high shock 
pressures during collisions in space, this poly- 
morph was fittingly named ringwoodite’. 

Until now, nobody had ever seen ringwood- 
ite from Earth’s mantle, although geophysi- 
cists were sure that it must exist. Most people 
(including me) never expected to see such a 
sample. Samples from the transition zone and 
lower mantle are exceedingly rare and are only 
found in a few, unusual diamonds. But even 
inside a diamond, the decrease in ambient 
pressure as the diamond rises towards Earth's 
surface should normally cause ringwoodite 
to convert back into olivine. However, in the 
diamond studied by Pearson and colleagues, 
ringwoodite remained in its original structure. 
This is an amazing finding, and suggests that 
the transport of the diamond to the surface 
Figure 1 | A view of the deep Earth. This 
illustration by Edouard Riou, from Jules Verne’s 
novel Journey to the Centre of the Earth (1864), 
shows how the French novelist imagined an ocean 
in Earth’s interior. 


must have been extremely rapid — possibly 
caused by an explosive volcanic eruption that 
was fed directly by magma produced in the 
transition zone. 

For a long time, it was believed that all the 
water that might once have been in Earth’s 
interior was released from it by volcanic erup- 
tion over geological time and now resides in 
the oceans. This view changed after a study* 
pointed out that wadsleyite — another high- 
pressure polymorph of olivine first synthesized 
by Ringwood’s group* — has an unusual crys- 
tal structure that should make it a potential 
phase for storing water. This prediction was 
later confirmed in many experimental studies, 
and one investigation® not only found plenty of 


Figure 2 | A diamond with a ringwoodite inclusion. Pearson et al.' have discovered a microscopic 
sample of ringwoodite, a polymorph of the mineral olivine, in this diamond from Juina, Brazil. The 
diamond is 5 millimetres across in its longest dimension. 


water in wadsleyite produced under high pres- 
sure in the laboratory, but also discovered up to 
2 weight per cent water in ringwoodite. Water 
in ringwoodite? This sounded implausible 
because the mineral has a spinel structure, and 
naturally occurring spinels do not tolerate any 
water in the form of hydroxyl (OH) groups. 
However, subsequent studies confirmed that 
ringwoodite has a high capacity to store water, 
similar to wadsleyite. Negatively charged Mg²⁺ 
vacancies in the structure seem to be charge 
balanced by the attachment of protons (H⁺ 
ions) to oxygen atoms. The OH groups created 
in this way represent ‘water’ that is chemically 
dissolved in the crystal structure. 

The fact that plenty of water can be dissolved 
in ringwoodite in high-pressure experiments 
does not, of course, necessarily mean that 
ringwoodite in the mantle also contains water. 
But, because water solubility in ringwoodite 
is much higher than in other minerals, such 
as olivine, thermodynamics predicts that the 
transition zone, where ringwoodite is stable, 
should be greatly enriched in water compared 
with the upper mantle’. This idea of a hydrous 
transition zone has excited Earth scientists 
over many years, and various attempts* ”” have 
been made to infer water content from obser- 
vations such as remote sensing of electrical 
conductivities in the transition zone and the 
precise depth of seismic discontinuities. But 
the results have been mixed, partly because any 
property of the mantle that can be measured 
will depend not only on the water content of 
the minerals present, but also on other para- 
meters — most notably, temperature. It would 
be nice to have a sample of a mineral from the 
transition zone for which the water content 
could be measured in the laboratory. 

Pearson and colleagues’ study provides just 
such a sample. The authors find that the infra- 
red spectra of their sample in the wavelength 
region where the OH groups absorb this radia- 
tion are strikingly similar to those of synthetic 


samples with water content of about 1 wt%, 
near the 2 wt% upper limit discussed earlier. 
If this sample were representative of the entire 
depth range of the lower part of the transition 
zone, between 520 and 660 km, where ring- 
woodite is stable, it would translate to a total 
of 1.4 × 10²¹ kg of water — about the same as 
the mass of all the world’s oceans combined. 
In some ways, it is an ocean in Earth's interior, 
as visualized by Jules Verne in his 1864 novel 
Journey to the Centre of the Earth (Fig. 1), 
although not in the form of liquid water, but 
as OH groups in an unusual mineral. 
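
The order of magnitude of that estimate can be checked with a back-of-envelope sketch. The depth range and the 1 wt% water content come from the text; the mean Earth radius, the mean rock density and the fraction of the rock that is ringwoodite are illustrative assumptions chosen for this example and are not stated in the article, so the result only shows that a figure near the mass of the oceans is plausible.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumed mean radius of Earth
DENSITY_KG_M3 = 3900.0        # assumed mean density of the lower transition zone
RINGWOODITE_FRACTION = 0.6    # assumed mass fraction of ringwoodite in the rock
WATER_WT_FRACTION = 0.01      # about 1 wt% water, from the text


def shell_volume_m3(top_depth_km, bottom_depth_km):
    """Volume of the spherical shell between two depths, in cubic metres."""
    r_outer = (EARTH_RADIUS_KM - top_depth_km) * 1e3
    r_inner = (EARTH_RADIUS_KM - bottom_depth_km) * 1e3
    return 4.0 / 3.0 * math.pi * (r_outer**3 - r_inner**3)


def water_mass_kg(top_depth_km=520.0, bottom_depth_km=660.0):
    """Water mass if all ringwoodite in the shell held WATER_WT_FRACTION water."""
    rock_mass = shell_volume_m3(top_depth_km, bottom_depth_km) * DENSITY_KG_M3
    return rock_mass * RINGWOODITE_FRACTION * WATER_WT_FRACTION


estimate = water_mass_kg()  # on the order of 10**21 kg under these assumptions
```

The result scales linearly with each assumed factor, so halving the assumed ringwoodite fraction or water content halves the inferred reservoir.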
Although Pearson and co-workers con- 
firmed the occurrence, and the high water 
content, of ringwoodite in a diamond (Fig. 2) 
from the transition zone, it is not entirely clear 
how representative that sample is of the whole 
transition zone. Diamonds are usually brought 
to the surface by ‘kimberlite’ magmas, which 
feed extremely explosive volcanic eruptions”. 
No such kimberlite eruption has ever been 
recorded, but there is evidence that kimberlite 
magma is extremely rich in volatile compo- 
nents such as water and carbon dioxide, and 
probably taps an unusually water-rich part 
of the upper mantle somewhere above the 
transition zone. If the source of the kimberlite 
magma is an unusual mantle reservoir, there is 
the possibility that, at other places in the transi- 
tion zone, ringwoodite contains less water than 
the sample found by Pearson and colleagues. 
However, in light of this sample, models with 
anhydrous or water-poor transition zones 
seem rather unlikely. 
G-quadruplex poses 
quadruple threat 


Multiplication of repetitive DNA sequences is often the cause of neurodegenerative 
diseases. A four-stranded structure has been found to form in one such expansion 
in the gene C9orf72, altering gene function in four ways. SEE ARTICLE P.195 


J. PAUL TAYLOR 


The most common cause of familial and 
sporadic forms of the neurodegenerative 
diseases amyotrophic lateral sclerosis 
and frontotemporal dementia is abnormal 
expansion of a repeated six-nucleotide DNA 
sequence’. The repeated sequence is located 
in a non-protein-coding region of the gene 
C9orf72, in which two nucleotides, guanine (G) 
and cytosine (C), recur in GGGGCC repeating 
units. But how expansion of the repeat leads to 


disease is unknown. On page 195 of this issue, 
Haeusler et al.* show that DNA and RNA mol- 
ecules comprised of GGGGCC repeats adopt 
a peculiar secondary structure that could 
account for several of the pathological features 
found in C9orf72-related diseases. 

In addition to the double helix described 
by James Watson and Francis Crick (in which 
two DNA strands align, G pairing with C and 
adenine pairing with thymine), a nucleic-acid 
strand that is rich in G can fold in on itself to 
form a four-stranded secondary structure 
called a G-quadruplex*. To adopt this topology, 
four G bases associate through atypical pair- 
ing, forming a square planar structure called a 
G-quartet, and two or more G-quartets stack 
on top of one another to form a G-quadruplex 
(Fig. 1). Haeusler and colleagues used several 
approaches to characterize the secondary 
structures formed by the expanded GGGGCC 
repeats in C9orf72-related amyotrophic lateral 
sclerosis (ALS) and frontotemporal dementia. 
In addition to G-quadruplex formation by 
GGGGCC-repeating RNA, which has already 
been shown for these diseases”*, they found 
stable G-quadruplex formation in DNA with 
this repeat. More importantly, they found that 
enhanced G-quadruplex formation in RNA 
and DNA has consequences for normal expres- 
sion of C9orf72. 

Sequences that can form G-quadruplexes 
have been conserved throughout evolution, 
and are enriched in functional locations such 
as transcriptional start sites, at which RNA 
polymerase, the enzyme responsible for gene 
transcription, attaches and begins to transcribe 
DNA. This positioning suggests a functional 
role for these structures in vivo. A flurry of data 
indicates that G-quadruplex assembly influ- 
ences diverse cellular processes, including 
transcription, translation and RNA localization’.
In the context of the C9orf72 GGGGCC
expansion, Haeusler and co-workers found 
that G-quadruplex assembly in DNA causes 
increased pauses in transcription in the 
expanded repeat region, and that this pausing 
could prevent normal elongation of transcripts 
to full length. 

Furthermore, Haeusler et al. determined 
that G-quadruplexes in C9orf72 DNA promote
the formation of stable R-loops — triple-
stranded nucleic-acid structures that assemble 
when a newly formed RNA transcript exiting 
the RNA polymerase invades the DNA double 
helix and binds to one DNA strand, displacing 
the other in the process. R-loops often form 
as a normal by-product of transcription, but 
protective mechanisms are in place to remove 
them. If R-loops are not resolved, however, 
they can halt transcriptional elongation, and 
pose a threat to efficient gene expression’. 

Haeusler and co-workers have therefore 
identified two distinct mechanisms, increased 
pausing and R-loop formation, whereby 
G-quadruplex assembly in C9orf72 could 
cause abortive transcription, producing short, 
repeat-containing transcripts but reducing 
overall production of the C9orf72 protein. 
Consistent with these findings, the authors 
observed that abortive GGGGCC-contain- 
ing RNAs accumulated in the spinal cord 
and the motor cortex region of the brain of 
patients with C9orf72-expanded ALS. These 
results potentially resolve the paradox of how 
GGGGCC-containing RNA foci — a hallmark 
of neurodegeneration, consisting of aggregates 
of RNA that form secondary structures and 
bind cellular proteins — could accumulate in 
patients with GGGGCC-repeat expansion,
despite a reported reduction in overall levels
of C9orf72 messenger RNA'”. The findings
might also partly explain why, in studies’””
that relied on various methods to monitor
different regions of the transcript, reduced
C9orf72 mRNA levels were inconsistently
found in cell lines derived from patients with
C9orf72-related disease.

Figure 1 | G-quadruplex formation in nucleic-acid
strands. G-quadruplexes form in guanine
(G)-rich regions of DNA or RNA. G bases bind
through atypical pairing to form square planar
structures called G-quartets, which can stack
to form a G-quadruplex. In DNA, secondary
structures also form in the complementary strand.
Haeusler et al.’ report G-quadruplex formation
in DNA when a repeated six-nucleotide sequence
expands in the gene C9orf72, as observed in
patients with certain neurodegenerative diseases.
This causes abortive transcription, resulting in
accumulation of short, repeat-containing RNA
transcripts that also form G-quadruplexes.

There has been tremendous interest in 
identifying the RNA-binding proteins that 
associate with C9orf72 transcripts contain- 
ing expanded GGGGCC repeats. Interaction 
between the RNA G-quadruplex and key RNA- 
binding proteins might sequester the proteins, 
preventing them from functioning normally. 
This could contribute to pathogenesis, analo- 
gous to the functional depletion of splicing 
factors (RNA-binding proteins that excise 
non-protein-coding segments of immature 
RNA following transcription) in the multi- 
systemic disease myotonic dystrophy”. So 
far, a few dozen proteins that bind GGGGCC 
repeats in vitro have been identified, a subset 
of which co-localize in RNA foci in cells from 
patients with C9orf72-related diseases'*'"*"". 
Haeusler and colleagues have taken a leap for- 
ward not only by performing a comprehensive, 
quantitative assessment of proteins that bind
C9orf72 transcripts, but also by determining 
the distinct RNA topologies favoured by these 
proteins. This enormously valuable catalogue 
will propel efforts to determine the role of indi- 
vidual RNA-binding proteins in disease. 

A highlight of Haeusler and co-workers’ 
analyses is the finding that the protein nucleolin 
interacts with RNA G-quadruplexes formed by 
C9orf72, consistent with a previous report that 
nucleolin can bind G-quadruplexes formed in 
other genes’®. Nucleolin is a multifunctional 
RNA-binding protein found in the nucleolus, 
a structure in the cell nucleus. Among other 
roles, nucleolin is required for the nucleolus’ 
main function: assembling the ribosome’®, 
the cellular machinery responsible for trans- 
lation. The authors observed that nucleolin 
is mislocalized to RNA foci in neurons of the 
motor cortex of patients with C9orf72-related 
disease. In concert with this abnormal location 
of nucleolin, the authors found signs that the 
nucleolus was unable to produce mature ribo- 
somes normally, culminating in the build-up 
of untranslated mRNA in the cytoplasm — one 
possible reason for pathology. 

C9orf72-related neurodegeneration could 
result from two modes of toxicity: loss of func- 
tion of C9orf72 or gain of function mediated 
by its product (or both). It will be crucial to elu- 
cidate the relative contribution of each of these 
two modes, because this will guide strategies 
for therapeutic intervention. In particular, 
it will be imperative to determine the role of 
reduced C9orf72 expression in disease, as well 
as the consequences of further decreases in 
levels of the C9orf72 protein (efforts are under 
way to silence this gene, for which no function 
is known) to prevent the accumulation of RNA 
foci'*'"”. Relevant to this, two groups recently
reported that reduction or loss of C9orf72 in 
zebrafish resulted in the degeneration of motor 
neurons'*””. Although depletion of C9orf72 in 
the central nervous system of mice” or in cul- 
tured human motor neurons’”’’”’ has not been 
associated with toxicity, it will be vital to assess 
the consequences of complete C9orf72 loss in
mammals. 

G-quadruplex assembly by C9orf72 might 
also cause toxic gain of function, not only by 
generating abortive GGGGCC-containing 
transcripts, but also because of the specific 
properties of G-quadruplex RNA. For exam- 
ple, G-quadruplexes can form either within a 
single RNA molecule or between two differ- 
ent RNA molecules, and this latter conforma- 
tion might drive assembly of disease-related 
RNA foci. Moreover, the secondary struc- 
tures of the RNAs that accrue in these foci 
could potentially influence the spectrum of 
sequestered proteins, shaping the resultant 
pathology. Finally, G-quadruplex formation 
could, through an unknown mechanism, pro- 
mote the production of toxic dipeptides from 
GGGGCC-containing RNA transcripts’. Thus, 
G-quadruplexes of C9orf72 in both DNA and
RNA might simultaneously promote loss of
normal function and gain of toxic function,
posing a quadruple threat.

Ordered randomness
in fly love songs 


A systematic and painstaking analysis reveals that much of the complexity and 
variability of the courtship song of male fruit flies can be accounted for by simple 
rules that relate sensory experience to motor output. SEE LETTER P.233 


BENCE P. ÖLVECZKY


Well-crafted love songs can be the
ticket to fun times and reproductive 
success, whether you are a member 
of the Beatles or one of the many animals that 
woo their mates by singing. Although some 
troubadours serve up monotonous repetitions 
of stereotyped songs, most animals, including 
birds, mammals and insects, like to jazz things 
up by varying their song patterns. 
But how the brain generates such 
variability, and improvisation more
generally, remains largely a mystery. 
On page 233, Coen et al.' shed light 
on this issue by showing that much 
of the variability in the love songs of 
fruit flies can be predicted from the 
singers’ movements. 
Neuroscientists’ fascination with 
the sex life of the fruit fly Droso- 
phila melanogaster began more 
than 35 years ago with the discovery 
of fruitless, a gene essential for the 
male courtship ritual’. This unique 
handle on a complex social behav- 
iour, in an organism amenable to 
genetic modification, paved the way 
for an exceedingly detailed anatomical
mapping of the underlying
neural circuitry”*. Deciphering the
details of what these circuits do and
determining what they can teach us
about brain function more broadly
are major challenges that would be
greatly helped by having a comprehensive
description of the computations
that the circuits perform and
the behaviours they implement.

One thing that we know these circuits do
is transform their male owners into mini
Casanovas. On encountering a receptive virgin
female, a male fly will gently tap her rear
end, serenade her with a ‘song’ by vibrating
one of his wings, and lick her genitalia’.
Although these behaviours are part of any
self-respecting fly’s lovemaking repertoire, the
duration and ordering of the different courtship
elements can be highly unpredictable.




Figure 1 | The male fruit fly’s serenade. a, Male fruit flies attract females 
by vibrating one of their wings. b, The fly has two distinct song types — the 
humming sine song and the purring pulse song — and switches between 
them to generate variable song sequences. Coen et al.' found that these 
switches can be predicted by the fly’s movements. (Data depicted in b 
taken from Fig. 6 of ref. 11.) 


© 2014 Macmillan Publishers Limited. All rights reserved 




13. Mori, K. et al. Acta Neuropathol. (Berl.) 125, 413-423 (2013).
14. Lee, Y.-B. et al. Cell Rep. 5, 1178-1186 (2013).
15. Xu, Z. et al. Proc. Natl Acad. Sci. USA 110, 7778-7783 (2013).
16. Abdelmohsen, K. & Gorospe, M. RNA Biol. 9, 799-808 (2012).
17. Lagier-Tourenne, C. et al. Proc. Natl Acad. Sci. USA 110, E4530-E4539 (2013).
18. Therrien, M., Rouleau, G. A., Dion, P. A. & Parker, A. PLoS ONE 8, e83450 (2013).
19. Ciura, S. et al. Ann. Neurol. 74, 180-187 (2013).

This article was published online on 5 March 2014.


What gives rise to such seemingly random 
behaviour? Is the variability due to stochastic 
fluctuations in the underlying neural networks 
(neural noise)®”’, or the result of a dynamic 
sensory experience? 

To address these questions, Coen and col- 
leagues focused on the male fly’s song, itself a 
variable sequence of distinct elements”. Just as 
the Beatles made a career of mixing ‘love’, ‘you’,
‘me’, ‘she’ and ‘baby’ in different ways, so male
fruit flies switch between ‘sine’ and ‘pulse’ songs 
to impress their audience (Fig. 1). By eaves- 
dropping on more than 100,000 love songs 
while carefully monitoring the whereabouts 
of the courting couple, the authors suggest that 
a logic and order exist in the apparent musical 
randomness. 

Coen et al. performed a statistical analysis 
of their high-resolution behavioural data, 
and found that transitions between sine and 
pulse songs can be predicted from the courted 
female's movements. The authors further discovered
that the male’s visual experience of the
female shapes his song through neural circuits 
that control locomotion. In fact, the 
best predictor of song structure is 
not the female’s movements, but the 
singer’s own. Even blind flies, who 
are induced to sing by the scent of 
virgin females, show a bias in their 
song transitions that can be pre- 
dicted from their movements. The 
picture that emerges from all of this 
is one in which the male fly executes 
a tightly integrated song-and-dance 
number, inspired by (if he can see 
her) his partner’s movements. 

As impressive as that may be, the 
extent to which the female cares 
about the details of her lover's intri- 
cate performance remains unclear. 
Does she use information embed- 
ded in his song pattern to determine 
his desirability? Does his ability to 
couple changes in his song to body 
movements — his or hers — cor- 
relate with other qualities that she 
would want in a mate? In other 
words, is song patterning an exam- 
ple of a carefully tuned signalling 
system, or does it reflect a coupling 
between leg and wing movements 
that evolved for unrelated reasons? 

50 Years Ago


Recent investigations have shown 
that the fluoride content of Greek 
teeth from the cities of Athens and 
Salonika was considerably high. This 
may explain, at least in part, the low 
prevalence of dental caries observed 
in Greece ... With the exception
of sea salt, however, the fluoride
content of other foods commonly
produced and consumed in Greece
is not known ... The analyses
showed that the fluoride content of
olive oil from the Island of Crete
was 0.36 p.p.m. and that from the
area of Kalamai 0.63 p.p.m. ...
it appears that the inclusion of olive
oil in the daily Greek diet does not 
make any significant contribution 
to the amount of ingested fluoride. 
Thus, at present, sea salt remains an 
important source of dietary fluoride 
in Greece for protection against 
dental caries. This may well be the 
case in other countries, such as 
Taiwan, Ceylon and Lebanon, where 
because of local food customs the 
amount of sea salt consumed has 
been estimated to be considerable: 
about 16-20 g per person per day. 
From Nature 14 March 1964 


100 Years Ago 


Think of the Niagaras of speech 
pouring silently through the New 
York telephone exchanges where 
they are sorted out, given a new
direction, and delivered audibly
perhaps a thousand miles away.
New York has 450,000 instruments
— twice the number of those
in London. Los Angeles has a
telephone to every four inhabitants 
... Our whole social structure has 
been reorganised. We have been 
brought together in a single parlour 
for conversation and to conduct 
affairs, because the American 
Telephone and Telegraph company 
spends annually for research ... a 
sum greater than the total income 
of many universities. 

From Nature 12 March 1914 


Initial experiments to address these ques- 
tions have failed to provide clear answers. 
Coen et al. show that song transitions are 
similar whether or not the singer is ultimately 
successful in mating. Yet pheromone-insensi- 
tive males, who sing for normal durations but 
have altered song patterning’, tend to be slower 
and less successful in convincing females to 
mate’*. Whether these flies are handicapped 
in the courting game because of a defect in 
how they vary their songs, or because of unre- 
lated effects, remains to be seen. But whether 
song patterning matters to females or not, we 
now know that its variability, and probably the 
variability of many other ‘fixed’ behaviours, 
is not simply the consequence of noise in 
nervous-system function®”’. Rather, a sizeable 
fraction of that variability is likely to reflect 
computations performed by reliable and pre- 
dictable brains on an ever-changing sensory 
environment. 

Importantly, this insight was made possible 
by simultaneously observing, at high temporal 
resolution, the sensory environment and behav- 
ioural output of a genetically tractable organ- 
ism during a complex social interaction. Such 
detailed analysis applied to natural behaviours 
has the power, as Coen et al. aptly demonstrate, 
to distil seemingly complex and unpredictable 
behavioural patterns into simple rules and sen- 
sorimotor transformations””®. With such an 
approach, rather than being the fog that prevents
us from understanding nervous-system
function, behavioural variability and complex- 
ity can be the searchlight that helps us to identify 
the computational problems that brains evolved 
to solve.

Speciation undone


Hybridization can cause two species to fuse into a single population. New 
observations suggest that two species of Darwin’s finches are hybridizing on a 
Galapagos island, and that a third one has disappeared through interbreeding. 


PETER R. GRANT & B. ROSEMARY GRANT 


The process of speciation, in which one
species splits into two, is vulnerable to
collapse in its early stages through inter-
breeding and the exchange of genes, a process 
referred to as introgression. As explained 
by the evolutionary biologist Theodosius 
Dobzhansky', “Introgressive hybridization 
may, then, be a passing stage in the process 
of species formation. On the other hand, the 
adaptive value of hybrids may be as high as 
that of their parent; introgressive hybridiza- 
tion may lead to obliteration of the differences 
between the incipient species and their fusion 
into a single variable one, thus undoing the 
result of the previous divergent development”.
Writing in American Naturalist, Kleindorfer 
et al.” offer a possible example of this process, 
in a study suggesting that one population of 
Darwin's finches has become extinct through 
interbreeding with another. 

Until Kleindorfer and colleagues’ report,
three species of tree finch were known to occur 
together in the highlands of Floreana Island 
in the Galapagos (Fig. 1). They differ in body 
size and in the size and shape of the beak, but, 
unlike many birds elsewhere, not in plumage. 
The medium tree finch (Camarhynchus 
pauper) is present only on Floreana, whereas 
the small tree finch (Camarhynchus parvulus) 
and large tree finch (Camarhynchus psittacula) 
also occur together on several other islands. 
The pattern of distribution and size differences 
led evolutionary biologist David Lack to sug- 
gest’ that speciation had occurred on Floreana 
through the invasion of large tree finches from 
Isabela Island, followed by evolutionary reduc- 
tion in average size. The resulting medium tree 
finches did not interbreed with the large tree 
finches that arrived later, apparently from 
Santa Cruz Island. 

Kleindorfer and colleagues now report that 
this pattern no longer exists: the large tree finch
has disappeared from Floreana. By comparing
the morphological features of present-day Flo- 
reana finches (studied in 2005 and 2010) with 
historical data, and conducting a genetic study 
of current populations using DNA-sequence 
markers (microsatellites), the authors show that 
there are currently only two distinct popula- 
tions on the island, corresponding to the small 
and medium tree finches. The analyses also 
revealed that individuals that do not fit into 
either population show intermediate char- 
acteristics, suggesting that they are hybrids. 
Consistent with the hypothesis of ongo- 
ing hybridization on the island, the authors 
observed females of the morphologically larger 
group (the medium tree finch) pairing with 
males of the smaller group, and they identified 
15% of yearling males in 2010 as hybrids. 

The authors suggest that hybridization may 
have been responsible for the disappearance 
of the large tree finch from Floreana, and that 
it may now be causing the remaining two spe- 
cies to fuse into one: speciation in reverse’. 
What has brought this about? The most likely 
answer is anthropogenic change to the habi- 
tat. A human settlement was established on 
the island just before Darwin's visit in 1835. 
The natural vegetation subsequently became 
rapidly degraded, and by the end of the nine- 
teenth century two species of finch and a spe- 
cies of mockingbird had become extinct’. 
The large tree finch was rare: only 4 male 
and 13 female specimens were collected for 
museums between 1852 and 1906. The birds 
may have experienced difficulty in finding 
mates of their own species, hybridized with 
medium ground finches and become absorbed 
into the population””. 

Alternatively, the large tree finch may have 
become extinct through changes in the food 
supply alone, without any interbreeding. 
One way of distinguishing between the two 
hypotheses might be to use molecular mark- 
ers to search for evidence of past introgression. 
If markers could be identified in the genomes 
of large tree finches on Santa Cruz (and the 
museum specimens from Floreana) but not 
on Isabela, and also found in the medium 
tree finches, they could be the smoking gun 
of introgression. 

To identify hybrids between the remaining 
medium and small tree finches, Kleindorfer 
et al. relied on a clustering technique with 
(acknowledged) low statistical power. But two 
other identifying clues are at hand. The first is 
beak shape, which is known to be a marker of 
species identity in these finches**. The second is 
song. Different species of Darwin's finches sing 
different songs, which are acquired through 
learning by nestlings and fledglings and used 
by adults to identify mates®”. A song of a large 
tree finch was tape-recorded on Floreana in 
January 1962°, but we failed to see or hear any 
of these birds on five visits during 1979-2004". 
The song of a large tree finch coming from 
the mouth of a medium tree finch could be a
wail from the ghost of an interbreeding past.

Figure 1 | Invasion, evolution and loss. Three species of Darwin’s finches, the small tree finch
Camarhynchus parvulus (a), the medium tree finch Camarhynchus pauper (b) and the large tree finch
Camarhynchus psittacula (c), have been known to inhabit the Galapagos island of Floreana. The medium
finch occurs nowhere else in the archipelago, and its morphological distinctiveness was interpreted by
evolutionary biologist David Lack to be the result of invasion of a small form of C. psittacula from
Isabela (1), followed by an evolutionary reduction in size and change in beak shape”. Later, a larger form
of C. psittacula invaded from Santa Cruz (2), and remained unchanged. However, Kleindorfer et al.” now
report that this large species is no longer found on Floreana. (Photographs a-c: P. R. Grant & B. R. Grant.)

Although there is some uncertainty about 
hybrid identification in this study, the disap- 
pearance of a species through hybridization is 
certainly plausible. On the small uninhabited 
island of Daphne Major, two species of ground 
finch (genus Geospiza) have been converging 
morphologically and genetically for more than 
30 years as a result of persistent (although rare) 
introgressive hybridization following a natu- 
ral change in the food supply. If introgression 
continues at the same rate, the two species will 
fuse into one in approximately 40 years’. The 
finches of Daphne Major are also a reminder 
that, under special circumstances, hybridiza- 
tion can lead to the opposite outcome — the 
formation of a new species. A new genetic 
lineage has become established on Daphne by 
an immigrant hybrid from Santa Cruz, and is 
now behaving as a separate species (see ref. 7 
for further details). 

Kleindorfer and colleagues’ findings 
suggest that the small and medium tree finches 
on Floreana may also be fusing into one spe- 
cies. The authors raise the intriguing possibil- 
ity that hybrids between these populations have 
an immunological advantage over the parental 
species in the face of attack by a parasitic fly, 
Philornis downsi, whose larvae eat and kill finch 
nestlings. The fly was introduced to the archi- 
pelago 50 years ago’. Their study is important 
because it adds weight to a growing concern 
that we humans are causing loss of biodiversity 
by altering habitats, in some cases by bringing 
separate species into proximity and causing
their extinction through interbreeding".
Rapid radiations of fishes'’”” and finches’ are
especially at risk because their morphologi- 
cal evolution is not accompanied by strong 
barriers to gene exchange. Uniquely valuable 
in showing how speciation is done””’, such 
species deserve special protection from being 
artificially undone.

The present and future role of
microfluidics in biomedical research 


Eric K. Sackmann1, Anna L. Fulton2 & David J. Beebe3


Microfluidics, a technology characterized by the engineered manipulation of fluids at the submillimetre scale, has shown 
considerable promise for improving diagnostics and biology research. Certain properties of microfluidic technologies, 
such as rapid sample processing and the precise control of fluids in an assay, have made them attractive candidates to 
replace traditional experimental approaches. Here we analyse the progress made by lab-on-a-chip microtechnologies in 
recent years, and discuss the clinical and research areas in which they have made the greatest impact. We also suggest 
directions that biologists, engineers and clinicians can take to help this technology live up to its potential. 


More than a decade ago, we wrote that “microfluidics has the
potential to significantly change the way modern biology is
performed”’. Indeed, we were part of a chorus of researchers
that recognised the possibility of new microfluidic tools making substantial
contributions to biology and medical research**. The optimism
surrounding microfluidics was well warranted, given the compelling
advantages that microfluidic approaches could possibly have over traditional
assays used in cell biology. Conceptually, the idea of microfluidics
is that fluids can be precisely manipulated using a microscale device
built with technologies first developed by the semiconductor industry
and later expanded by the micro-electromechanical systems (MEMS)
field. These devices, commonly referred to as miniaturized total analysis
systems (µTASs)*’ or lab-on-a-chip (LoC) technologies, could be applied
to biology research to streamline complex assay protocols; to reduce the
sample volume substantially; to reduce the cost of reagents and maximize
information gleaned from precious samples; to provide gains in scalability
for screening applications and batch sample processing analogous to
multi-well plates; and to provide the investigator with substantially more
control and predictability of the spatio-temporal dynamics of the cell
microenvironment.

The field of microfluidics is characterized by the study and manipulation
of fluids at the submillimetre length scale. The fluid phenomena
that dominate liquids at this length scale are measurably different from
those that dominate at the macroscale (Box 1). For example, the relative
effect of the force produced by gravity at microscale dimensions is greatly
reduced compared to its dominance at the macroscale. Conversely, surface
tension and capillary forces are more dominant at the microscale;
these forces can be used for a variety of tasks, such as passively pumping
fluids in microchannels*; precisely patterning surfaces with user-defined
substrates’; filtering various analytes'’; and forming monodisperse droplets''
in multiphase fluid streams for a variety of applications. These
examples represent only a fraction of the myriad problems that microfluidic
technologies have attempted to address.

The development of comprehensive microfluidic solutions to address
problems in biology and clinical research has been embraced by engineers.
However, despite material advances in microfluidics as a technology
platform, the adoption of novel µTAS techniques in mainstream
biology research has not matched the initial enthusiasm surrounding
the field’’. Some argue the technology is still in search of a ‘killer application’,
where the sample-to-answer concept provides a solution that greatly
outperforms current methods'**. In this perspective, we will examine the
impact of microfluidic technologies on cell biology and medical research 
within the past decade. We discuss some of the barriers to adoption of 
microfluidic technologies in mainstream biomedical research, and use 
a case study to illustrate and highlight these challenges. We focus our 
attention on recent developments in the field that are facilitating the 
application of microfluidic technologies to solving problems in diagnost- 
ics and biology research. In this area, we highlight the innovative use of 
different materials that are more optimally suited to performing a given 
task; and we examine how researchers are taking advantage of µTAS
methods to enable scientific inquiry in ways that were not possible using 
traditional methods. Finally, we will discuss positive trends in the field 
and infer lessons that can be applied to future microfluidic technology 
development. 


The impact of microfluidics on biomedical research 


A primary goal for much of the microfluidics community is to develop 
technologies that enhance the capabilities of investigators in biology and 
medical research. Many microfluidic studies describe methods that aim 
to replace traditional macroscale assays, and usually perform proof-of- 
concept (PoC) experiments that attempt to demonstrate the efficacy of 
the new approach. These novel microfluidic methods are usually pub- 
lished in journals that might be characterized as ‘engineering’ journals, 
or publications whose readership comprises largely engineers and other 
members of the physical sciences (for example, chemists and physicists). 
If publishing PoC studies in engineering journals represents the devel- 
opment phase for a novel biology assay, then the implementation of the 
technique can be characterized as when the technology is used and pub- 
lished in a biology or medical journal. After all, the stated goal of vir- 
tually all PoC studies is to demonstrate new technologies that enable 
biologists in their everyday research. 

We measured the use of microfluidic technologies in mainstream bio- 
medical research over the past decade to assess their impact beyond the 
engineering community (Fig. 1). In order to identify broad trends of 
what journals have published papers that use microfluidics (search terms 
“microfluidic*” and “nanofluidic*”; see Fig. 1 legend), we defined three 
categories: (1) ‘engineering’ journals (for example, Lab on a Chip, Small, 
Analytical Chemistry); (2) ‘biology and medicine’ journals (for example, 
Blood, Cell, Journal of Clinical Investigation); and (3) ‘multidisciplinary’ 
journals (for example, Nature, Science, Proceedings of the National Academy 


1Materials Science Program, Department of Biomedical Engineering, Wisconsin Institutes for Medical Research, University of Wisconsin-Madison, 1111 Highland Avenue, Madison, Wisconsin 53705-2275, USA.
2Wendt Commons Library, University of Wisconsin-Madison, 215 North Randall Avenue, Madison, Wisconsin 53706, USA.
3Department of Biomedical Engineering, Wisconsin Institutes for Medical Research, University of Wisconsin-Madison, 1111 Highland Avenue, Room 6009, Madison, Wisconsin 53705-2275, USA.


13 MARCH 2014 | VOL 507 | NATURE | 181 




REVIEW 


BOX 1

Useful microfluidics concepts

Laminar versus turbulent flow. The Reynolds number (Re) is a 
dimensionless quantity that describes the ratio of inertial to viscous 
forces in a fluid. Re is proportional to the characteristic velocity of the 
fluid and the length scale of the system; it is inversely proportional 
to the fluid viscosity. High-Re (~2,000) fluids have flow profiles 
that increasingly mix stochastically (turbulent flow; Box 1 Figure 
below). For microfluidic systems, Re is almost always in the laminar 
flow regime, allowing for highly predictable fluid dynamics. Molecular 
transport also changes dramatically at this scale because convective 
mixing does not occur, enabling predictable diffusion kinetics. 
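
As a rough illustration of the proportionalities described above, the sketch below computes Re from the standard definition (density × velocity × length / dynamic viscosity) and compares it with the ~2,000 figure quoted in this box. The function names, the sharp threshold and the water-like property values are illustrative assumptions rather than part of the original box; in practice the transition to turbulence is gradual and geometry dependent.

```python
def reynolds_number(density, velocity, length, viscosity):
    """Re = density * velocity * length / dynamic viscosity (SI units)."""
    if viscosity <= 0:
        raise ValueError("viscosity must be positive")
    return density * velocity * length / viscosity


def flow_regime(re, turbulent_threshold=2000.0):
    """Crude two-way classification using the ~2,000 figure from the text.

    Assumption: a single sharp cut-off; real flows pass through a
    transitional regime instead of switching abruptly.
    """
    return "laminar" if re < turbulent_threshold else "turbulent"


# Assumed water-like properties: 1000 kg/m^3 and 1e-3 Pa*s.
# Hypothetical microchannel: 100 micrometres wide, flow at 1 mm/s.
re_microchannel = reynolds_number(1000.0, 1e-3, 100e-6, 1e-3)

# Hypothetical macroscale pipe: 10 cm wide, flow at 1 m/s.
re_pipe = reynolds_number(1000.0, 1.0, 0.1, 1e-3)

print(flow_regime(re_microchannel), flow_regime(re_pipe))
```

With these assumed values the microchannel sits several orders of magnitude below the threshold, which is why flow in microfluidic devices is almost always laminar.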

Surface and interfacial tension. Surface tension describes the 
tendency of a fluid in a surface to reduce its free energy by contracting 
at the surface-air interface. Interfacial tension is a similar 
phenomenon, but is generally applied to two immiscible fluids (for 
example, oil and water). These forces play more dominant roles on
material (Box 1 Figure below). At the microscale, capillary action is a
more dominant force, allowing fluids to advance in opposition to 
gravity. Capillary forces have been used to manipulate fluids in 
many applications, the most famous examples perhaps being the 
at-home pregnancy test and portable glucometers to monitor blood
glucose levels.

Box 1 Figure | Panel titles: Laminar versus turbulent flow (laminar flow;
turbulent flow); Surface and interfacial tension; Capillary forces. Label:
Liquid (for example, water).

Figure 1 | Microfluidic publications in engineering, multidisciplinary, and
biology and medicine journals from 2000 to 2012. a, In 2012, there were
roughly 10 times more microfluidic publications in engineering journals
compared to biology and medicine (biomedical) journals (left-hand pie chart
inset). However, the share of microfluidics papers being published in
multidisciplinary journals decreased as publication share in biomedical and
engineering journals increased (right-hand pie chart). b, Word cloud
illustrating what fields most frequently used microfluidics. The size of the
font is proportional to the cumulative number of publications in the Web of
Science (WoS) category (2000-12), with the exception of ‘cell biology’,
which would need to be ~5 times larger. Methodology of the searches was
as follows. A literature search was performed using WoS (provided by
Thomson Reuters) to determine the number of microfluidics publications in
various disciplines. The search was performed for the terms “microfluidic*”
and “nanofluidic*”. The number of publications were obtained from the WoS
analytics reporting system for each search term, and then summed before
being presented above. Three categories were characterized by the WoS search
that capture the relevant journals for the years 2000-12. The analysis shown
here as ‘Microfluidics in engineering journals’ reports the number of
microfluidic publications in the ‘Nanoscience and nanotechnology’ WoS
category. The analysis shown here as ‘Microfluidics in multidisciplinary
journals’ corresponds to the ‘Multidisciplinary’ WoS category. The analysis
shown here as ‘Microfluidics in biology and medicine journals’ reports
publications from WoS categories shown in Fig. 1b. The search explicitly
excluded ‘reviews’, ‘book chapters’, ‘book reviews’, ‘meeting abstracts’, ‘meeting
summaries’, and included ‘articles’. The data shown reflects the most
recent literature search, performed on 21 March 2013. The following
search general string was used: Topic = (microfluidic*) AND Year
Published = (2000-2012) AND Document Types = (Article) NOT
Document Types = (Book OR Book Chapter OR Book Review OR Meeting
Abstract OR Meeting Summary OR Proceedings Paper OR Review). This
string yielded the total ‘microfluidic*’ publications in all WoS categories
(* allows for permutations of the keyword). The search was then refined
by the WoS categories shown above (for example, Web of Science
Categories = (MULTIDISCIPLINARY)). Importantly, the nominal results
of this search would probably vary if other search tools such as SCOPUS,
Google Scholar and PubMed were used. For example, the gross number of
publications would probably increase if SCOPUS were used for the search,
as this tool indexes a higher number of journals than WoS.

of Sciences). The results reveal, unsurprisingly, that the overwhelming
number of microfluidics papers are still being published in engineering
journals (Fig. 1a). These engineering journals have facilitated the
technological development and growth of microfluidics over the past decade.
It is important to note that some of these ‘engineering’ studies may have
been designed for non-biomedical purposes, but this does illustrate where
the majority of microfluidic activity and exposure has occurred. Today the
majority of microfluidics publications still appear in engineering journals
(85%) as the microfluidics community has grown substantially, and ‘biology
and medicine’ journals have taken some publication share from
interdisciplinary journals (9% and 6%, respectively).

Last, we analysed what fields within the biomedical research community
are using microfluidic technologies the most (Fig. 1b). ‘Cell biology’
and ‘Biology’ encompass most of the microfluidics publications, possibly 
because these categories are somewhat generic and incorporate several 
subcategories. Following these, the most use of microfluidics is seen in 
‘Haematology’, ‘Medicine and experimental research’ and ‘Immunology’. 
Most of these publications are for diagnostic applications (in the case of 
Medicine and experimental research) and the manipulation of blood sam- 
ples for biology research (Haematology and Immunology)—applications 
where microfluidics has compelling advantages over traditional methods. 
However, despite these few examples, the evidence suggests that a ‘killer 
application’ that propels microfluidics into the mainstream has yet to emerge. 


A case study in chemotaxis assays 


The state of the art for most conventional assays used in cell biology 
research is evolving and improving over time. Biologists understand 
better than anyone the deficiencies of the techniques they use, and indi- 
vidual groups occasionally make modifications to traditional assays that 
are adopted more broadly by other biology researchers. An example of 
this technological evolution can be observed in visual chemotaxis assays— 
techniques that measure the directional migration of a cell in response to a
source of chemotactic factors that change concentration in space and time. 

Chemotaxis assays have improved substantially since their initial intro- 
duction in the 1960s (Fig. 2). The most widely used chemotaxis assay is 
known as the ‘Boyden chamber’ or ‘Transwell’ assay, developed in 1962 by
Boyden. The Transwell assay works by creating a concentration gradient
of chemoattractant compounds between two wells that are separated by 
a microporous membrane. Chemotactic cells located in the upper well sense 
the gradient in concentration and migrate across the membrane towards 
the solution in the lower well, where the cells are counted. Its simplicity
and ease of use (no special instrumentation is required) have contributed
to its widespread use over the past 50 years. Investigators have used the
method to identify chemotactic factors for various cell types, even though
the technique does not allow observation of the cell migration path
or cell morphology. This experimental limitation (along with others) led
to the development of visual chemotaxis assays such as the Zigmond
chamber. In this system, cells can be observed as they undergo chemotaxis
on a coverslip across a narrow constriction (tens of micrometres)
towards a source of chemoattractant. It is worth noting that the Zigmond
chamber is a microfluidic device developed by biologists at least a decade
before the emergence of the microfluidic/µTAS field as we know it.
Importantly, this technique allows for clear imaging of cell migration and
morphology. Modifications to this design, called the Dunn and Insall
chambers, were subsequently developed, and these advances substantially
improved the high-resolution, long-term imaging capabilities of
visual chemotaxis assays (Fig. 2). The Insall chamber represents the most
recent of a long evolution of direct-viewing chemotaxis chambers that 
have been developed over the course of three decades. 

Microfluidics has offered many solutions for next-generation chemo- 
taxis assays (reviewed in refs 19 and 20); however, none of these methods 
have seen widespread adoption at the level of the aforementioned
traditional assays. Additionally, efforts to commercialize microfluidic
chemotaxis assays have had limited success in the marketplace; notable
products include µ-Slide Chemotaxis (ibidi), Iuvo Chemotaxis Assay Plate
(BellBrook Labs) and EZ-TAXIScan (Effector Cell Institute). The
generation of chemical gradient profiles is an area where microfluidic
technologies are uniquely qualified because of the highly predictable,
diffusion-dominant characteristics of the fluid flow at this scale (Box 1).
Yet traditional assays are still predominantly used for chemotaxis studies
in cell biology research. The low adoption rate of microfluidic chemotaxis
assays may be due to the fluid handling expertise and infrastructure
required in early designs, which may have acted as a barrier to entry
for biologists. Recently published microfluidic chemotaxis techniques
are beginning to take usability requirements into consideration, and
demonstrate simpler chemotaxis assay designs that do not require active
pumping systems. Another possibility is that biologists are more
comfortable with using the existing direct-viewing chemotaxis assays that have
been developed and vetted over nearly 40 years (Fig. 2). Notably, each 
iterative improvement on the Zigmond chamber design was published 
by investigators with appointments in biology (Zigmond); experimental 
pathology (Boyden and Dunn); and cancer research (Insall)—none of the 
designs were produced from ‘engineering’ disciplines. These technical 
advances were made by biologists to address unmet needs in their own 
research. And in the case of visual chemotaxis, the methods were, in fact, 
microfluidic by any reasonable definition, yet they are not typically in- 
cluded within the microfluidic vernacular. In the case of chemotaxis 
assays, engineers have sometimes erred by imposing technological com- 
plexity and functionality where it was not necessarily needed or wanted. 
This case study illustrates the continuing need for engineers and biolo- 
gists to work closely during assay development to create usable and robust 
solutions that build on biologically validated approaches, while adding 
functionality that allows new avenues of biological inquiry. 


Materials tailored for specific applications 


Unlike the semiconductor industry where silicon is the backbone material
on which the technology has been built, the materials used for
developing microfluidic devices have undergone a large transition over
the years. Early µTAS devices were fabricated from silicon and glass
using clean-room techniques that were translated to microfluidic device
fabrication. This was largely a choice of convenience (because the techniques
and facilities were already in place) and necessity (early microfluidics
focused largely on electrophoretic phenomena where glass is a preferred 
material), but not a long-term solution for cell biology research. Silicon is 
opaque to visible and ultraviolet light, making this material incompatible 
with popular microscopy methods. Glass and silicon are both brittle
materials, have non-trivial bonding protocols for closing microchannels,
and in general require expensive, inaccessible fabrication
methods. These materials were well suited for some applications (for
example, electrophoresis), but were ultimately limited in their growth 
potential. Cheaper, more accessible materials and fabrication methods 
were needed to fuel the growth of microfluidic technology development 
and adoption. 

Elastomeric micromoulding techniques were developed by Bell Labs
in the 1970s, and first applied to microfluidics and cell biology in the
1980s. In 1998, Whitesides used polydimethylsiloxane (PDMS), an
optically transparent, gas- and vapour-permeable elastomer, for the
fabrication of more complex microfluidic devices and helped ‘soft
lithography’ become the most widely adopted method for fabricating 
microfluidic devices. It would be hard to exaggerate how important and 
enabling PDMS has been for microfluidics, contributing to the growth of 
the field in both technological development and number of publications”. 
Adoption of the material can be attributed to several key factors, includ- 
ing (1) the relatively cheap and easy set-up for fabricating small numbers 
of devices using PDMS in a university setting; (2) the ability to tune the 
hydrophobic surface properties to become more hydrophilic**”’; (3) the 
ability to reversibly and (in some cases) irreversibly bond PDMS to glass, 
plastic, PDMS itself, and other materials; and (4) the elasticity of PDMS, 
which allows for easy removal from delicate silicon moulds for feature 
replication. In addition to the practical fabrication considerations of using 
an elastomer, there are also useful functional advantages. Researchers 
have used the elasticity of PDMS to create micropillar arrays that assay 
the mechanobiology of various cell types. However, perhaps most
importantly, the elasticity of PDMS allows for valving and actuation,
which has led to a plethora of microfluidic designs and publications.
Fluidigm, the largest commercial µTAS technology company currently
in the market, builds its microfluidic systems using deformable
elastomers (NanoFlex valves).

Despite all the beneficial properties of PDMS that enabled its rapid
adoption amongst university engineers, there are several limitations to
implementing the material in biomedical research. For example, PDMS
has been found to leach uncrosslinked oligomers from the curing process
into solution, requiring additional device preparation to mitigate
this potentially harmful effect. Additionally, PDMS has been shown to
absorb small molecules, which can affect critical cell signalling dynamics.
Furthermore, the vapour permeability of PDMS means that evaporation
can occur in an experiment, which can be detrimental for cell
microenvironments at micro- and nanolitre fluid volumes. Strategies
such as parylene coating the microchannel surface and other techniques
have been developed to mitigate these problems, but these processes are
consequences of deploying a non-ideal material for cell biology
applications; the often cited ‘biocompatibility’ of PDMS appears to be
something of a misnomer. Last, the manufacture and distribution of PDMS
devices to collaborators is not easily scalable, because high-throughput
methods such as injection moulding, rolling and embossing cannot be
used for PDMS devices. Thus, making PDMS prototypes for iterating on
a new design concept is relatively easy, but making many of these devices
and packaging them for collaborators or commercialization is non-trivial.
Given these limitations, clearly PDMS is not a one-size-fits-all material for
all microfluidic applications, and particularly for cell biology research.

The limitations of PDMS have prompted researchers to explore alternative
materials in recent years (Fig. 3). In the microfluidics community,
there has been a push towards the use of thermoplastics such as
polystyrene and cyclic olefin copolymer for microfluidic devices (Fig. 3A),
although some research laboratories have always used these materials in
lieu of PDMS. Thermoplastic materials such as polymethyl methacrylate
and polycarbonate were popular for the fabrication of µTAS
devices in the 1990s, but lost favour with researchers because the fabrication
methods were more difficult and expensive than those of PDMS for the
typical academic laboratory. However, the microfluidics community has
addressed this issue by developing more accessible fabrication methods
for thermoplastic µTAS devices, although these techniques are not
without limitations. We have recently argued that polystyrene should
be preferred over PDMS for many cell biology applications, particularly
because biologists have a long history of using polystyrene for cell culture.
Furthermore, the use of polystyrene mitigates or eliminates many material
property issues associated with PDMS, including the bulk absorption
of small molecules and evaporation through the device, and polystyrene
makes handling and packaging easier for use in collaborations.

Figure 3 | Materials other than PDMS are being used for microfluidic device
design. A, Several research groups have demonstrated accessible methods
of thermoplastic microfluidic device fabrication. Examples of various
microfluidic designs fabricated in polystyrene are shown. B, C, Paper (B),
and to a lesser extent, wax (C) are being used in the developing world for
diagnostic applications owing to benefits in device cost, operation and
destructibility with limited waste infrastructure. B, An example of a
paper-based microfluidic device for detecting glucose and protein. The
integrity of the hydrophilic patterning is shown with a red dye (a); the detection
zones for glucose (circular region on left) and protein (square region on
right) are also shown (b); and representative tests detecting a single
concentration (c) and multiple concentrations (d) of protein and glucose
from an artificial urine sample are also shown. C, An example of a wax
microfluidic device (zoomed view in inset) that can perform an enzymatic
immunoassay. Figure sources, used with permission: A, ref. 59; B, ref. 64;
C, ref. 101. B is adapted with permission from Martinez, A. W., Phillips, S. T.,
Whitesides, G. M. & Carrilho, E. Diagnostics for the developing world:
microfluidic paper-based analytical devices. Analytical Chemistry 82,
3-10 (2010). Copyright (2010) American Chemical Society.

In addition to thermoplastic materials, there has been substantial progress
in using destructible, cheap materials such as paper (Fig. 3B), wax
(Fig. 3C) and cloth for point-of-care applications in low-resource
settings. These materials have the benefit of being cheap and easily
incinerated, making them ideal choices for settings where safe disposal of
biological samples is challenging. Currently there is increasing activity
in developing microfluidic paper-based analytical devices (µPADs).
These µPAD devices are expansions on tried-and-tested lateral flow assays
(for example, pregnancy strip test) and operate by passively wicking
biological samples through patterned hydrophilic regions using capillary forces;
they often use colorimetric readouts. The hydrophobic channel patterning
can be accomplished using a variety of methods, such as wax printing,
photolithographic patterning of photoresist, inkjet printing of PDMS,
and flexographically printed polystyrene. µPAD devices are becoming
increasingly sophisticated, with a recent study demonstrating a single-step
enzyme-linked immunosorbent assay (ELISA) for the detection of
human chorionic gonadotropin.

The movement beyond PDMS with the use of thermoplastics and 
other materials is a positive development for the microfluidics commu- 
nity. Rather than solely relying on PDMS for device fabrication regard- 
less of its limitations, researchers are beginning to consider new materials 
that more suitably meet the requirements of biological assays and are 
amenable to high-throughput manufacturing. The shift to materials
beyond PDMS enables researchers to more effectively export technolo- 
gies in scale, and allows for new solutions to problems in performing cell 
biology and diagnostic assays. However, different materials often require 
a re-thinking of component design. For example, it is difficult to imple- 
ment the displacement valves and pumps so ubiquitous in PDMS devices 
in other non-elastic materials. Therefore, technological progress using 
alternative materials will require creative new approaches from engineers 
who design powerful and user-friendly µTAS devices.


When µTAS technologies are the only solution


Most of the microfluidic technologies that were developed for cell biology 
applications in the early 2000s sought to improve on existing macroscale 
assays. Many of these technologies delivered on the promised perform- 
ance improvements, yet were never adopted by mainstream biology 
researchers. Another possible reason for this lack of adoption, beyond 
those we have previously discussed, is that these technologies are im- 
provements on established techniques. Although microfluidic methods 
may in some cases be technologically superior, they are often only iter- 
ative improvements on methods that already exist. Someone interested in 
performing protein analysis might conduct a western blot or ELISA. To 
study cell chemotaxis, a researcher might perform a Transwell assay. To 
investigate tissue regeneration after a wound, an investigator might scratch 
some cells with a micropipette tip and see what happens. Microfluidic 
techniques exist that perform many of these assays with equivalent or 
improved performance, but they have not offered fundamentally new
capabilities compared to the current state-of-the-art. 

Within the past several years there have been a growing number of 
microfluidic technologies that solve problems that have not yet been 
addressed by macroscale approaches. Two recognizable examples that 
embody this distinction can be found in the glucometer and the pregnancy
test (or more broadly, lateral flow assays). Each test passively wicks
bodily fluids into porous materials, either blood (glucometer) or urine
(pregnancy test), and performs a previously complex biochemical assay
in a single step to provide an immediate measurement. Although there
were benchtop assays that could perform these tasks, the portability and
rapid feedback these assays provided were transformative for the end user.
There are currently applications like these for which microfluidic meth- 
ods have demonstrable advantages over traditional methods. These vari- 
ous applications share overlapping qualities that make them potentially 
useful techniques. However, for the purpose of this discussion, we will break 
them into three categories: diagnostic devices for low-resource settings; 
the rapid processing of biofluids for research and clinical applications; 
and more physiologically relevant in vitro models for drug discovery, 
diagnostics and research applications. 


Diagnostics for low-resource settings 

The western model of centralized laboratories processing clinical sam- 
ples with expensive equipment does not translate well to the developing 
world. Many low-resource settings do not have the means or infrastruc- 
ture to perform these tests and analyses, necessitating creative alternative 
solutions to meet this largely unsolved problem. Microfluidic methods 
are being developed to perform a variety of diagnostic tests with built-in 
analysis capabilities that are compatible with the infrastructure in the 
developing world (Fig. 4). As discussed earlier, new material systems such
as paper, wax and others are being explored in this area. Common
themes with these devices include being ultra-simple to operate and the 
provision of some qualitative or quantitative output that can be measured 
with low-cost and ubiquitous equipment (for example, a mobile-phone 
camera or scanner). Also, ideally, the materials used to make the devices 
are easily destructible to avoid unsafe contamination, and are cheap and 
scalable to manufacture (preferably locally). In a recent study, Chin et al.
aimed to meet these requirements in a microfluidic chip that performs an 
ELISA-like assay within ~20 minutes using volumes of blood that can be 
obtained from a lancet puncture (Fig. 4). Importantly, the assay did not 
require external pumping systems; it emphasized straightforward opera- 
tion; and it used cheap photodetectors for the rapid optical readout. The 
authors analysed more than 70 blood samples obtained from a hospital 
in Rwanda and successfully diagnosed human immunodeficiency virus 
(HIV) in all but one patient, achieving sensitivity and specificity values 
that rival a laboratory-based ELISA test. This study and others are
promising indications that µTAS technologies could make meaningful
contributions to healthcare in the developing world.
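
For readers less familiar with how such diagnostic performance is scored, the sketch below shows the usual definitions of sensitivity and specificity that underlie comparisons with a laboratory-based ELISA. The function names and the counts are made up purely for illustration; they are not the data from the study described above.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive samples that the test reports as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly negative samples that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
# These are NOT results from any study cited in this article.
true_pos, false_neg = 9, 1   # 10 infected samples, 1 missed
true_neg, false_pos = 8, 2   # 10 uninfected samples, 2 flagged in error

print("sensitivity:", sensitivity(true_pos, false_neg))
print("specificity:", specificity(true_neg, false_pos))
```

A test can score well on one measure and poorly on the other, which is why both are normally reported when a point-of-care device is benchmarked against a laboratory reference.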

Low cost is arguably the most important feature when aiming to 
increase access to diagnostics in the developing world, but it is also an 
increasingly important factor in the developed world. If we can achieve 
appropriate performance/cost combinations for the developing world, 
many believe that these technologies will play an important role in 
transforming the way medicine is delivered in the developed world by 
enabling in-home testing and treatment. However, traditional lateral flow 
assays achieve a low-cost/high-performance benchmark, and thus re- 
present a high standard against which new approaches are compared. 


Rapidly assaying biofluids with microfluidics 

Engineers have made use of properties unique to the microscale to 
enable studies that would be difficult or impossible using macroscale 
approaches (Fig. 5). These methods have found clinical applications, 
because they use ultra-low volumes of biofluids for the sample proces- 
sing and can usually be accomplished rapidly and easily. To some degree 
these assays mimic what macroscale assays accomplish, but the methods 
offer new approaches that enable fundamentally new applications. For 
example, the rapid purification and analysis of neutrophils (the
phagocytotic cells that are first responders for the innate immune system)
has been demonstrated in several studies in recent years for clinical and
research applications. Importantly, these techniques reduce blood
processing times from roughly an hour (using millilitres of blood from a
venipuncture) to a few minutes (using only microlitres of blood from a
finger prick). Thus, the methods can be applied to measure neutrophil
function for diagnostic and research purposes, enabling a new class of
studies that have previously been beyond the capabilities of macroscale
methods. Other purification schemes have been developed that take
advantage of the increased dominance of surface tension at the microscale
to sort target analytes in biofluids across multiphase barriers (for
example, oil and water; see Box 1) using fast and simple procedures.
Not only is this purification scheme simpler and faster than most
macroscale methods, but improved sensitivities for protein and genetic
purifications may be achievable owing to a reduction in the number of wash
cycles required to carry out an experiment. These applications are only
some of the examples where microscale benefits are being used to perform
experiments that are not reasonably achievable using macroscale
techniques.

Figure 4 | Diagnostics in the developing world. These are excellent
examples of exploiting the benefits of µTAS technologies where classical
(Western) diagnostic paradigms fail. a, A user-friendly cartridge to perform
enzyme-linked immunosorbent assays (ELISAs) for the diagnosis of HIV
and other diseases. A schematic showing the functional steps of the assay is
shown on the left and the microfluidic device is shown on the right. b, 3D µPAD
showing complex fluid handling operations that occur passively in a paper
device for diagnostics in resource-limited settings. The example shows the flow
of several coloured dyes in a patterned µPAD device, with a cross-section of
the 3D structure also shown. Figure sources, used with permission: a, ref. 73;
b, ref. 64. b is reprinted with permission from Martinez, A. W., Phillips, S. T.,
Whitesides, G. M. & Carrilho, E. Diagnostics for the developing world:
microfluidic paper-based analytical devices. Analytical Chemistry 82, 3-10
(2010). Copyright (2010) American Chemical Society.


More physiologically relevant in vitro models

The pharmaceutical industry is currently faced with unsustainable research
and development (R&D) costs that require it to change how
the development and approval of new drugs are pursued. The industry
faces multiple headwinds, such as the exclusivity on blockbuster drugs
soon expiring for several companies, and dramatically fewer new drugs
being approved by the Food and Drug Administration (FDA) in recent
years. These circumstances necessitate new strategies for drug development
that increase R&D productivity in order to avoid a potential drought
in effective new drugs coming to market.

Microfluidics researchers are taking aim at this problem by develo- 
ping potentially transformative technologies to mitigate the cost of new 
drug development. A new class of microfluidic devices seeks to replicate 
in vivo organ function on a microchip (Fig. 6). This new class of so-called
‘organ-on-a-chip’ technologies integrates several well-understood
microfluidic components into a single in vitro device, allowing researchers to
more closely recapitulate in vivo function (both normal and disease states).
This ambitious effort is still in its infancy, though several promising studies
have developed examples of these biomimetic systems. Examples of organ
(or disease)-on-a-chip technologies include gut-on-a-chip, lung-on-a-chip,
blood vessel-on-a-chip, cancer-on-a-chip and kidney-on-a-chip.
Furthermore, these modular systems could theoretically be
combined into a complete ‘human-on-a-chip’ model that mimics in vivo 
function of these organs working in concert”’. The result would be a class 


Figure 5 | Rapid purification microfluidic systems. a, A microfluidic device 
to purify neutrophils within minutes using antibody-based capture for 
subsequent diagnostic or research analysis. The microfluidic device is shown at 
the upper left with stained neutrophils that have been sorted from whole blood 
below (scale bar = 20 ktm); an illustration of the neutrophils captured within 
the microchannels by antibodies (zoomed view in inset) is also shown. b, A 
technique to purify target analytes such as RNA, cells and proteins by simply 
sliding a magnet across an immiscible aqueous-oil interface. An example 
shown here illustrates four steps to purify protein from a sample (zoomed view 
to the right shows detail) by (1) removing the analyte bound to paramagnetic 
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particles across the first aqueous-oil barrier, (2) binding a primary antibody to 
the analyte and dragging it across another aqueous-oil barrier, (3) binding a 
fluorescently labelled secondary antibody to the complex and bringing it across 
another aqueous-oil barrier into the imaging well (4), where the fluorescence is 
measured to detect the amount of analyte (white and grey fluorescent image). 
Figure sources, used with permission: a, ref. 74; b, ref. 10. b is adapted with 
permission from Berry, S. M., Maccoux, L. J. & Beebe, D. J. Streamlining 
immunoassays with immiscible filtrations assisted by surface tension. 
Analytical Chemistry 84, 5518-5523 (2012). Copyright 2012 American 
Chemical Society. 
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of sophisticated in vitro assays with which drugs could be tested, in the 
hope of increasing the predictability of a new drug (that is, hit rate) before 
animal testing (possibly even replacing animal trials) and human clinical 
trials. In a tangential application, blood vessel-on-a-chip devices have 
already been used for the diagnosis of sickle cell disease in the clinic.
For example, Tsai et al. described a microfluidic chip that recapitulated
in vivo conditions of a blood vessel—such as blood flow rate, endothelial 
cell shear stress and biochemical activation states—in order to reliably 
detect vascular occlusions due to sickle-cell disease. This system highlights 
how certain properties of microfluidic systems, such as high-resolution 
micropatterning and precise control of the haemodynamic and shear 
profiles in the microchip, enabled the measurement of biophysical abnor- 
malities in a clinical setting. Much more work is still required before 
organ-on-a-chip methods can be adopted in mainstream drug R&D, al- 
though early developments in this area are promising. Indeed, AstraZeneca— 
a multinational pharmaceutical company—has recently announced a 
collaboration with Harvard’s Wyss Institute to research the integration 
of organ-on-a-chip technologies into their drug development. 


Where we go from here 


The question of how to increase the adoption of microfluidic technolo- 
gies in mainstream biomedical research remains largely unanswered, 
and we argue there are no guaranteed routes to achieve adoption. We 
have shown that microfluidic technologies are being used for some studies 
in biology research and diagnostic applications; however, the large major- 
ity of microfluidics publications are still in technical journals specific to 
the field (Fig. 1). Adoption of new technologies that supplant or even 
complement existing methods is often a slow process. For evidence of 
this, we consider the computer mouse, which took 20 years to appear in 
the Macintosh computer after its invention by Engelbart in the 1960s. But 
this does not mean microfluidics engineers should become disillusioned 
or discouraged. Researchers in the field must develop deliberate and 
thoughtful strategies that will best push the technology forward. We now 
have several decades of experience to draw on, and there are some useful 
lessons we can apply. 
Figure 6 | Organ-on-a-chip assays for drug development and specialized
diagnostic applications. a, Complex microsystems can be developed to
recreate an organ’s physiology, such as the physiology of the lung, directly on a
microfluidic device. The diagram illustrates a biomimetic microfluidic
design that actuates stretching of tissue in a breathing-like manner by using
vacuum in side chambers to strain the cell-coated PDMS membrane. This
process mimics the reduction in intrapleural pressure (Pip) in the lungs during
breathing. b, Biomimetic blood vessel and capillary networks can also be
Fostering mutually beneficial collaborations

During the early years of microfluidics, the field did not have a successful 
strategy for transferring technological developments to non-engineering 
users. Perhaps the idea was that researchers from the biology community 
would rush to work out how to make use of these new technologies. 
Clearly this formula of engineers and biologists leading separate academic 
lives does not benefit either community. Fortunately, researchers have
acknowledged that a divide between the developers of the technology 
and the end-users is counterproductive. Most of the recent microfluidics 
papers published in ‘Biology and medicine’ journals are co-authored by
engineers, biologists and clinicians. This evidence of increasing collab- 
oration is a promising development for everyone involved. In order to 
sustain this trend, microfluidic researchers should court collaborators 
from biology and clinical laboratories (and vice versa). Direct interac- 
tion and feedback from the end-user is tremendously beneficial during 
technology development. Furthermore, new applications and ideas can be 
generated from biology collaborators that engineers—being non-experts 
in cell biology or clinical research—would never have considered. 


The simplest solution is almost always best 

All the signs indicate that there is no simple solution for accelerating the 
adoption process; however, there are design choices engineers can make 
in order to lower the barrier to entry for biologists. How the end-user 
interacts with a new technology is a critical aspect of whether the method 
is adopted. Microfluidics engineers have been attempting to simplify fluid 
handling challenges in their designs with passive pumping approaches 
that only require a micropipette to operate. Additionally, some
have explored the use of centrifugal forces to perform complex assays
using a ‘lab-on-a-CD’ design. Many microfluidic applications require
the use of external pumps and pneumatic fluid handling systems; exam- 
ples include most organ-on-a-chip devices and techniques that require 
continuous flow to generate specific shear profiles (for example, biomi- 
metic blood vessel models). However, engineers should limit the use of 
these external systems whenever possible. Creating a simpler approach 
often requires more creative solutions, but this can greatly improve the 




recreated in vitro to diagnose SCD and other diseases involving blood vessel–whole
blood interactions. An image of the microfluidic device is shown (top)
next to a penny for scale, with a diagram at right showing the increasingly
narrow capillary network; confocal microscopy images of the endothelial cell-lined
lumens within the device are also shown (bottom) with the cell nucleus
(blue) and cell membrane (red) visible. Scale bars: 600 µm (black), 30 µm
(white). Figure sources, reprinted with permission: a, ref. 85; b, ref. 86.
experience for the end-user. Paper diagnostic assays are an excellent example
of single-step, automated and user-friendly µTAS solutions where the
technology is not visible and the user can focus on interpreting the results.
We have recently developed a similarly straightforward, automated approach
for general cell biology applications that does not require external
pumping equipment or even a micropipette to perform complex assay
protocols. General problems of packaging and distributing microfluidic
technologies to collaborators will also need to be addressed until micro- 
fluidic assays become more commercially viable in the academic research 
market. These problems should be viewed through the lens of user-friendly 
assay design. 


Finding the right problems to solve 

The case study we have used (chemotaxis assays) helps to illustrate how 
competing technology platforms continue to improve over time as micro- 
fluidic technologies develop. Some of the touted advantages of micro- 
fluidic systems that existed 20 years ago are not as stark today because 
technological improvements have been made to more traditional and 
widely accepted assays, often narrowing the initially perceived perform- 
ance advantage of microfluidic solutions. This evolution in the techno- 
logy landscape highlights the need for finding the right problems in biology 
and medicine to solve with microfluidic approaches. For example, micro- 
fluidic solutions have advantages over many technologies for diagnostics 
in the developing world. However, commercializing these technologies is
challenging because, by definition, the desired diagnostic devices will not 
generate much revenue or profit. So the breadth and depth of impact may 
be great for this particular application, but a disconnect exists between 
development and commercialization. Likewise, there may be niche biolo- 
gical questions that can be addressed using microfluidic methods, but for 
which broad commercial markets do not exist. A key consideration in the 
development of new microfluidic methods in academic research should 
be whether the use of microfluidics introduces truly enabling function- 
ality compared to current methods. When a potential application passes 
this test, the chances of contributing useful technology to the field are 
substantially higher. 
Alveolar progenitor and stem cells in lung
development, renewal and cancer 


Tushar J. Desai, Douglas G. Brownfield & Mark A. Krasnow


Alveoli are gas-exchange sacs lined by squamous alveolar type (AT) 1 cells and cuboidal, surfactant-secreting AT2 cells. 
Classical studies suggested that AT1 cells arise from AT2 cells, but recent studies propose other sources. Here we use molecular
markers, lineage tracing and clonal analysis to map alveolar progenitors throughout the mouse lifespan. We show that,
during development, AT1 and AT2 cells arise directly from a bipotent progenitor, whereas after birth new AT1 cells derive
from rare, self-renewing, long-lived, mature AT2 cells that produce slowly expanding clonal foci of alveolar renewal. 
This stem-cell function is broadly activated by AT1 injury, and AT2 self-renewal is selectively induced by EGFR (epider- 
mal growth factor receptor) ligands in vitro and oncogenic Kras(G12D) in vivo, efficiently generating multifocal, clonal 
adenomas. Thus, there is a switch after birth, when AT2 cells function as stem cells that contribute to alveolar renewal, 
repair and cancer. We propose that local signals regulate AT2 stem-cell activity: a signal transduced by EGFR-KRAS 
controls self-renewal and is hijacked during oncogenesis, whereas another signal controls reprogramming to AT1 fate. 


Pulmonary gas exchange occurs in delicate alveolar sacs lined by two 
epithelial cell types (Extended Data Fig. 1). Squamous alveolar type
(AT) 1 cells mediate gas exchange, whereas cuboidal AT2 cells secrete 
surfactant that prevents alveolar collapse; AT2 cells are one of the med- 
ically most important cells in the neonate and, as described below, one 
of the most dangerous in adults. Serious diseases including respiratory 
distress syndrome and idiopathic pulmonary fibrosis involve a failure 
to establish or maintain AT1 and AT2 cells, and alveoli are a major site 
of lung cancer, the leading cause of cancer death.

Despite their importance, the identity of alveolar progenitor and 
stem cells is controversial and their activity throughout life uncharted.
Classical morphologic and autoradiographic studies in rodents suggested
that progenitors mature into AT2 cells during development,
some of which differentiate into AT1 cells. Maintenance is difficult to
study because of slow turnover, but lineage tag expression in isolated
AT1 cells is observed following bulk labelling of AT2 cell populations.
To circumvent slow turnover, lung injury models have been used and
provide evidence that AT2 cells can contribute to alveolar repair.
Currently, six different cell populations have been proposed as alveolar
stem cells on the basis of their capacity for clonal propagation and
multilineage differentiation in culture. Transplantation assays
have also been used, but because many cells are implanted they cannot
assess whether individual cells self-renew and undergo multilineage
differentiation. When one of these putative stem cell populations
was fate-mapped in vivo, it failed to demonstrate the multilineage
differentiation achieved in culture; similar disparity between ex vivo
and in vivo behaviour of putative stem cells has been found for other
organs. Here we use a battery of alveolar markers, lineage tracing and
clonal analysis in mice to identify alveolar progenitor and stem cells 
in vivo and map their locations and activity during lung development, 
maintenance and cancer. 


AT1 and AT2 cells arise from a bipotent progenitor 

Mature AT1 and AT2 cells appear about 1 day before birth, when 
distal tubules begin to dilate (‘sacculation’, Fig. 1a–c). We mapped
progression of sacculation in three dimensions by analysing finely staged
whole-mount lungs immunostained for E-cadherin (Cdh1) to visualize
individual cells (Fig. 1a–c and Extended Data Fig. 1e–f). Dilation begins
at the bronchoalveolar junction then progresses distally towards the
airway tip (Fig. 1a–c).

The classical model proposing that progenitors in development are 
pre-AT2 cells is difficult to reconcile with the finding that some AT1 
cell markers are expressed up to 5 days before sacculation. To molecularly
classify progenitors, we validated 15 extant AT1 and AT2 markers
(Supplementary Table 1) then analysed the transition in labelling
between distal (progenitors) and proximal (nascent AT1 and AT2 cells)
positions in a sacculating airway (Fig. 1d) to infer dynamic expression
changes during differentiation (Fig. 1e–p). Markers fell into six expression
classes (Extended Data Table 1), distinguishing seven stages in alveolar
development (Fig. 1e–p). However, instead of a progenitor to AT2
to AT1 progression, our data support a model in which bipotent progenitors
(P) expressing a subset of AT1 (1) and AT2 (2) markers (P1ᴱ,
P1ᴸ, P2ᴱ and P2ᴸ) give rise to either AT1 or AT2 cells by shutting off
inappropriate cell type markers early (E) or late (L) in differentiation,
then turning on cell type-specific late (L) markers (A1ᴸ, A2ᴸ) as they
complete maturation (Fig. 1q). Co-expression of AT1 and AT2 markers
by progenitors indicates that these specialized cell types may have
evolved from a primordial pneumocyte with features of both, similar to
those in less derived vertebrates such as lungfishes.

Three additional lines of evidence support the bipotent progenitor 
model. First, clonal analysis of individual distal airway epithelial tip 
cells labelled on embryonic day (E) 15 using an inducible Cre recombinase
(linked with the oestrogen receptor (ER), Shh-Cre-ER) demonstrated
localized alveolar lineage clusters with marked AT1 and AT2
cells (Extended Data Fig. 2a,b), confirming that individual cells are bipo- 
tent. Second, ultrastructural analysis of early sacculation revealed three 
classes of distal epithelial cells (Fig. 1r-u): cuboidal cells with glycogen 
vacuoles but no lamellar bodies (bipotent progenitors), cuboidal cells 
with vacuoles and lamellar bodies (early AT2 cells), and partially flattened 
cells with vacuoles (early AT1 cells). We never observed partially flat- 
tened cells with lamellar bodies, the presumed AT2-to-AT1 intermediate 
predicted by the classical model (Extended Data Fig. 1g). Third, lineage 
Figure 1 | Development of alveolar type 1 (AT1) and AT2 cells from
bipotent progenitors. a–c, Mouse lung lobe tips stained for E-cadherin
(E, embryonic day; cr, crown-rump length in mm; PN, postnatal day). a′–c′,
Sacculation (sac; asterisks) proceeds proximally (P) to distally (D) along the
airway. Scale bars, 100 µm (a–c), 20 µm (a′–c′). d, Staining for AT1 (Pdpn) and
AT2 (SftpC) markers at E18.3 shows coexpression in pre-sacculation zone and
restriction in late sacculation zone. Scale bar, 10 µm. e–p, Tips imaged in pre-sacculation
(e–j) and late sacculation (k–p) zones identify six classes of marker
expression profiles (P1ᴱ, P1ᴸ, A1ᴸ, P2ᴱ, P2ᴸ, A2ᴸ; Extended Data Table 1). Scale bar,
20 µm. q, Inferred differentiation pathways showing changes in AT1 (green) and
AT2 (red) marker classes. Oval, lamellar body. r, s, Electron micrographs at
E18.3 showing early (r) and late (s) sacculation zones. BP, bipotent progenitors;
e1, early AT1 cells; e2, early AT2 cells. Scale bar, 2 µm. t, u, Boxed regions in
s showing (e1) an early AT1 (squamous, glycogen vacuoles (GV) without lamellar
bodies (LB)) and (e2) an early AT2 (cuboidal, GV, LB). N, nucleus; bar, 0.5 µm.


tracing of newly differentiated AT2 cells using a Cre recombinase 
knock-in (LysM-Cre) at the lysozyme M (Lyz2) locus, a late AT2 marker
gene (A2ᴸ), labelled many AT2 cells in the embryo but no marked
AT1 cells were seen by 2 weeks postnatally (Extended Data Fig. 2c, e). 

Little epithelial proliferation was detected during sacculation and 
for several weeks afterward (Extended Data Fig. 3), indicating that 
maturation of bipotent progenitors generates most or perhaps all AT1 
and AT2 cells in development. Flow cytometry at early sacculation (E18.1) 
showed bipotent progenitors or pre-AT2 cells (Muc1⁺ Pdpn⁺) made
up 8% of distal cells, whereas early or mature AT2 cells (Muc1⁺ Pdpn⁻)
made up 5%. Bipotent progenitors seem to be fully exhausted by postnatal
day 4, when sacculation has completed throughout the lung. In
adult lungs few if any bipotent progenitors were found by flow cytometry
(0.3% ± 0.2% Muc1⁺ Pdpn⁺ distal epithelial cells) or labelling
with Shh-Cre-ER, suggesting a transition to other sources. 


Alveolar renewal foci by rare activation of AT2 cells 

Classical autoradiographic studies suggested alveolar epithelium is a slowly 
renewing population maintained by diffuse proliferation, presumably 
of AT2 cells. We used in vivo lineage tracing to investigate the role of
AT2 cells in maintenance, marking them in two complementary ways, 


ARTICLE 


with similar results. In one, a Cre-ERT2-rtTA knock-in at surfactant 
protein C (SftpC) locus (SftpC-Cre-ER) was used with a membrane- 
localized fluorescent reporter (mTmG) to pulse-label AT2 cells by tamox- 
ifen induction at postnatal (PN) 18 days; mice were analysed 13 or 
192 days later. Although SftpC is often used as a mature AT2 marker, 
the results above (Extended Data Table 1) show that as a class P2ᴱ marker
it is also expressed by the bipotent progenitor. Hence, we also labelled
AT2 cells using LysM-Cre described above, a class A2ᴸ marker that
initiates expression only in mature AT2 cells (Extended Data Fig. 4a, b, 
d, m). Both lines robustly labelled AT2 cells throughout the lung, and 
at 2 months after marking (SftpC-Cre-ER) or postnatally (LysM-Cre) 
showed scattered AT1 cells labelled with the AT2 lineage tag (Extended 
Data Fig. 2d, f and Extended Data Fig. 5). AT2 cells did not significantly
contribute to maintenance of bronchiolar lineages including Clara, 
ciliated and neuroendocrine cells (Extended Data Fig. 4c, e-m), and 
although rare cells near bronchoalveolar junctions that co-expressed 
SftpC and Clara cell marker CCSP/Scgb1a1 were also tagged (‘bronchoalveolar
stem cells’ or BASCs), they proliferated little if at all (Extended
Data Fig. 4f-h, j-m). 

To investigate the frequency and spatial localization of AT1 renewal 
by AT2 cells, we labelled lungs as above using LysM-Cre and analysed 
them at 1, 2, 4, 8 and 16 months of age. Less than 1% of AT1 cells expressed
the AT2 lineage tag at 1 month, 3.9% at 4 months and 7.5% at 16 months
(Fig. 2a-c). AT1 cell replacement occurred preferentially in alveoli
abutting arterioles and in the lung periphery (Fig. 3a, b). Wherever
new AT1 cells arose, the alveolus was replaced essentially in half or its
entirety, indicating alveoli include just one or two AT1 cells. With 
ageing, adjacent alveolar units were also renewed, evidenced by slowly 
enlarging clusters of AT1 cells marked with the AT2 lineage tag. These 
patches of AT1 replacement were indistinguishable from surrounding 
areas, except for AT1 labelling with the AT2 lineage tag; similar results 
were obtained with SftpC-Cre-ER (Extended Data Fig. 5). We conclude 
that AT1 replacement by AT2 cells occurs intermittently and in a spa- 
tially patchy distribution in ‘renewal foci’ that slowly enlarge over time, 
with perivascular and peripheral regions serving as relative ‘hot spots.’ 
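
As a rough illustration of the pace implied by the labelling fractions reported above, the short Python sketch below computes the average increase in lineage-tagged AT1 cells between the 4-month and 16-month time points. It is illustrative arithmetic only, not part of the reported analysis: the function name `average_rate` is hypothetical, the 1-month value is left out because it is given only as an upper bound, and a constant rate between the two time points is assumed.

```python
# Fractions of AT1 cells carrying the AT2 lineage tag, as quoted in the text.
# The 1-month value is reported only as "less than 1%", so it is omitted here.
observations = [(4, 3.9), (16, 7.5)]  # (age in months, % of AT1 cells labelled)


def average_rate(points):
    """Average increase in labelled AT1 cells, in percentage points per month.

    Assumes (for illustration) a constant rate between the first and last point.
    """
    (t0, f0), (t1, f1) = points[0], points[-1]
    return (f1 - f0) / (t1 - t0)


rate_per_month = average_rate(observations)
rate_per_year = rate_per_month * 12
print(f"{rate_per_month:.2f} points/month, {rate_per_year:.1f} points/year")
```

This figure describes the fraction of AT1 cells carrying the tag, so it is not directly comparable to rates expressed per alveolus.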


Renewal foci derive from single, differentiated AT2 cells 
Each renewal focus could derive from a single AT2 cell, in which case 
the cells would be clonally related, or from multiple AT2 cells. To dis- 
tinguish these possibilities, we used a Confetti Cre-dependent reporter 
that stochastically expresses one of four fluorescent proteins in each 
cell that undergoes recombination. We analysed the membrane CFP 
lineage marker and found that foci were similar in size and progressive 
enlargement to those observed using the single-colour reporter, indicating
that renewal foci derive from a single ‘founder’ AT2 cell. Because
of fortuitously inefficient recombination of the Confetti reporter using 
LysM-Cre (~2% of AT2 cells expressed the membrane-targeted CFP 
lineage tag at 16 months), we were able to distinguish marked AT2 cells 
within renewal foci. Most contained one or two AT2 cells and multiple 
AT1 cell progeny colonizing up to six contiguous alveoli (Fig. 2d–f). We
also observed small clonal foci comprising only AT2 cells (Extended 
Data Fig. 6b, c), indicating they are capable of dividing without form- 
ing AT1 cells (self-duplication). Founder AT2 cells expressed mature AT2 
markers including LAMP-1, a lysosome-associated protein, indicating
they continue to produce surfactant (Extended Data Fig. 6a), and they 
did not express Clara, ciliated and neuroendocrine markers. Their 
only distinguishing feature was their nucleus, which was often slightly 
larger and stained more intensely for Nkx2.1 than nearby AT2 cells 
(see also Fig. 2d, e). We conclude that each renewal focus derives from 
a single, self-renewing AT2 cell that can generate multiple AT1 and/or 
AT2 cells, is long-lived, and remains closely associated with its pro- 
geny, as evidenced by its persistence in large foci at advanced ages 
(Fig. 2e). Because founder cells appear to be mature (LysM-Cre lineage 
positive), functional (SftpC-Cre-ER lineage and LAMP-1 positive) AT2 cells 
that share the essential properties of conventional stem cells (multipotent, 
Figure 2 | Mature AT2 cells renew AT1 cells in clonal foci. a–c, AT2
lineage-labelled AT1 cell foci (green) enlarge with ageing, incorporating
adjacent alveoli (asterisks). Dotted line, mesothelium. Scale bar, 150 µm.
d, e, AT2 cells were sparsely marked with Confetti reporter, and mCFP-labelled
foci (green) are shown co-stained for AT2 marker Nkx2.1 (red) at age 2 (d) and
16 (e) months. Note ‘founder’ AT2 cell (arrow) and its labelled AT1 progeny
(green). Numbers, incorporated alveoli. Entire clone schematized in right
panel. Red, founder AT2; green lines, AT1 daughters visible (solid) or outside
(dashed) focal plane; black, unlabelled AT1 (dotted lines) and AT2 (ovals) cells.
Scale bar, 20 µm (d, e). f, Clone size increases with ageing (P = 0.03, Kruskal-Wallis
test). n, clones scored; clone size, number of incorporated alveoli.


self-renewing, persist for life), they are ‘bi-functional’ stem cells, 
executing both differentiated and regenerative functions. 
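
The role of sparse labelling in the clonality argument can be made concrete with a small calculation. The Python sketch below is illustrative only and is not taken from the study: it assumes that each AT2 cell is tagged independently with probability 0.02 (the approximate mCFP labelling efficiency quoted above) and asks how often a focus founded by several AT2 cells would have every founder tagged. The function name `prob_all_founders_labelled` is hypothetical.

```python
def prob_all_founders_labelled(p_label: float, n_founders: int) -> float:
    """Chance that every one of n independently labelled founders carries the tag."""
    if not 0.0 <= p_label <= 1.0:
        raise ValueError("p_label must lie between 0 and 1")
    if n_founders < 1:
        raise ValueError("n_founders must be at least 1")
    return p_label ** n_founders


# Assumed labelling efficiency (~2% of AT2 cells express the mCFP tag).
P_LABEL = 0.02

probabilities = {k: prob_all_founders_labelled(P_LABEL, k) for k in (1, 2, 3)}
for k, prob in probabilities.items():
    print(f"{k} founder(s): {prob:.6f}")
```

Under these assumptions, a uniformly tagged focus with two or more founders would be expected far less often than one with a single founder, which is consistent with the single-founder interpretation.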


AT2 cell activation by acute AT1 cell injury 


Elevated oxygen tension is toxic to AT1 cells, but not to AT2 cells.
Exposure of 2-month-old mice carrying the AT2 lineage tag to 88%
oxygen for 120 h tripled the number of renewed AT1 cells (Fig. 3c-e).
This finding shows that AT2 cells become activated following hyper- 
oxic injury, and supports the idea that although normally only a rare 
subset of AT2 cells executes a stem cell function, others can be recruited 
to repair alveolar damage. Whether every AT2 cell can be activated this 
way could not be determined because more severe hyperoxia was lethal. 


Figure 3 | Activation of AT2 stem cell function in vivo and proliferation
in vitro. a, b, Renewal foci (AT2 lineage label, green; other cells, red)
commonly involve alveoli (asterisks) in peripheral (a, mesothelium, dots)
and perivascular (b, dashes) domains. Scale bar, 50 µm. c–e, Alveolar regions
under room air (c) or after hyperoxia (88% O₂, 5 days) to injure
AT1 cells (d). Note increased foci (dashed ovals) after injury. Scale bar, 50 µm.
Quantification (e) shows increased alveolar surface (n = 2; mean ± s.e.m.) from
AT2-lineage labelled cells after hyperoxia. P < 0.05 (Mann-Whitney U test).
Oncogenic Kras selectively activates AT2 self-renewal


Adenocarcinoma, the major form of lung cancer, is associated with 
activating mutations in Kras or Egfr. Typically located in peripheral
lung regions, nearly all tumour cells express SftpC, leading to long- 
standing speculation that they originate from AT2 cells or their pro- 
genitors. However, the identity of the tumour-initiating cell(s) remains 
controversial. To test the effect of oncogenic Kras(G12D) on mature
AT2 cells, a conditional Kras allele (KrasLSL-G12D, a knock-in at
the Kras locus in which the wild-type Kras coding sequence is replaced 
by a lox-STOP-lox-Kras(G12D)) was activated using LysM-Cre along 
with the mTmG Cre-dependent reporter. Tumour nodules grew rapidly 
throughout the lungs (Fig. 4a–d), with dense replacement of virtually
the entire alveolar region by 1 month after induction and death shortly 
thereafter (Fig. 4e). When lungs were examined in the first few days 
following induction, we found with a Rainbow multi-colour reporter 
that nearly every epithelial cell expressing the AT2 lineage tag prolifer- 
ated, demonstrating highly efficient AT2 transformation by Kras(G12D) 
(Fig. 4f-i). The biggest tumours were found in peripheral and peri- 
vascular regions, sites where physiological AT1 renewal by AT2 cells 
was commonly observed (compare Figs 3a, b and 4o, p). Similar results
were obtained using SftpC-Cre-ER to activate Kras(G12D) in adult mice
(Extended Data Fig. 7a, c). By contrast, when we used CCSP-Cre-ER, 
most Clara cells were unaffected or divided minimally, whereas at bron- 
choalveolar junctions some formed small clonal adenomas (Extended 
Data Fig. 7b). We also used ubiquitously expressed ROSA-Cre-ER to 
activate the Kras(G12D) allele at random, resulting in many singlets 
and minimally affected cells throughout the lung, even 18 days after 
induction (Fig. 4j). The transformed AT2 cells comprising the adeno- 
mas continued to express AT2 markers (Nkx2.1, SftpD) and did not 
turn on a Clara marker (CCSP) or, with rare exceptions, AT1 markers 
(Pdpn, LEL) (Extended Data Fig. 8). Thus, oncogenic Kras(G12D) 
seems to selectively and permanently induce AT2 self-renewal, without 
deprogramming the cells to the bipotent progenitor or causing repro- 
gramming to AT1 or Clara cell fates. 

By examining lineage-tagged Kras(G12D) mutant lungs at progres- 
sive stages, we could infer the cellular mechanism of adenoma formation 
(Fig. 4k-m). Proliferation of the activated AT2 cell generates daughter 
cells that spread laterally yet maintain a monolayer (‘lepidic’ expansion),
the first histological sign of the tumour (Fig. 4l). Later, cells heap
up and form a nodule (‘hilical’ expansion) that obliterates the lumen and 
begins compressing and invading adjacent alveoli (Fig. 4m). Infrequent 
f, g, Freshly isolated AT2 cells (f, phase contrast) cultured 4 days in Matrigel
(g) proliferate, shown by Ki67 staining (red, asterisks), but maintain AT2
marker expression (SftpC, green). h–j, Images (h, i) and quantification
(500 cells per biological replicate, n = 4, 3, 3, 3, 4) (j) of proliferation with EGFR
blocking antibody (anti-EGFR, 2.5 µg ml⁻¹) and EGF ligands indicated (4 1M).
Scale bar, 10 µm (f–i). Mean ± s.e.m.; *P < 0.05; **P < 0.01; ***P < 0.001
(Tukey’s multiple comparisons test).
Figure 4 | Transformation of mature AT2 cells by activated Kras.
a–c, LysM-Cre > mTmG control (a) and LysM-Cre > mTmG,KrasLSL-G12D/+
(abbreviated Kras*) (b, c) lungs at age 7 weeks, showing proliferated AT2 cells
(b, green cells) compressing surrounding cells (red) and forming adenomas
(c, haematoxylin and eosin stain). Scale bar, 20 µm. d, Lung lobe as in b showing
widespread infiltration by tumour (green). Inset, control lobe. Scale bar, 1 mm.
e, Survival curves. f–i, LysM-Cre > Rainbow,Kras* lungs at indicated ages
(PN, postnatal day). Note rapid clonal (single colour) expansion of labelled
AT2 cells with minimal cell mixing. Scale bar, 50 µm (f–h), 100 µm (i).


clones developed into large, well-formed papillary structures, although 
most formed simple adenomas. At advanced stages, a notable pattern
of large, closely packed, single-colour tumours was evident, showing 
minimal cellular exchange between neighbouring tumour foci (Fig. 4i). 
Some advanced tumours became infiltrated by macrophage ‘nests’ 
(Fig. 4n), a poor prognostic sign.


EGFR activity controls AT2 self-renewal in culture 

We profiled the transcriptome of bipotent progenitors and AT2 cells. 
Both express Egfr, two other EGFR family members (Erbb2 and Erbb3) 
(Supplementary Table 2), and receptors for many other signals (Sup- 
plementary Tables 2 and 3). Purified AT2 cells (Fig. 3f) retain a robust 
ability to both proliferate (>25% Ki67-positive, Fig. 3g, j) and to differentiate
into AT1-like cells (>95% thin, flat morphology and aquaporin
5-positive, n = 240 cells) (Extended Data Fig. 6d). Addition of purified
EGF ligands (transforming growth factor α (TGF-α), heparin-binding
EGF (HB-EGF), NRG1) stimulated AT2 proliferation, whereas EGFR- 
blocking antibody inhibited proliferation (Fig. 3h-j). None of these 
treatments induced differentiation into AT1 cells. EGF signalling is there- 
fore a critical and selective regulator of AT2 proliferation, at least under 
these conditions. 


Discussion 


Our characterization of alveolar progenitor and stem cells throughout 
the lifespan supports a model in which AT1 and AT2 cells arise inde- 
pendently during development from a bipotent progenitor (Fig. 5a). 
Several weeks after birth, when alveolar development is complete, there 
is a switch and mature AT2 cells become a renewable source of AT1 
and AT2 cells. But these postnatal renewal events are rare and occur in 
monoclonal foci (‘alveolar renewal focus’, Fig. 5b) that slowly expand 
over months or years. Only a small fraction (~1%) of mature AT2 cells 
normally express this stem cell function and they divide intermittently 
(~40-day doubling time) and supply only local domains, giving an over- 
all renewal rate of just 7% of alveoli per year. This stem cell function is 
more broadly induced by AT1 injury, indicating that other AT2 cells 
may have similar potential (‘alveolar repair focus’, Fig. 5b) that can be 
k–n, A recombined (green) AT2 cell (k) generates progeny that spread laterally,
giving ‘pearl bracelet’ appearance (l). Cells later ‘heap up’ into nodules (m),
some infiltrated by macrophages (n; red, anti-F4/80). Scale bar, 10 µm.
o, p, Haematoxylin and eosin stain shows robust tumour growth peripherally
and perivascularly (P-V). Boxed area (o, close-up in p) shows proliferated cells
around blood vessels (v). Scale bar, 200 µm.


activated by dying AT1 cells. Classical pneumonectomy experiments 
support the idea that most AT2 cells possess latent regenerative capacity,
and AT2 ablation induces self-duplication of surviving AT2 cells
(‘AT2 replacement focus’). Our data do not exclude alternative sources
of new alveolar cells, especially following severe injuries that deplete
AT2 cells regionally. 

If many or all AT2 cells can serve as stem cells, why does only a minor- 
ity execute this function for maintenance, producing large monoclonal 
foci? Perhaps alveolar turnover is coupled with a stem cell hierarchy 
whereby an initially activated AT2 cell suppresses nearby AT2 cells and 
becomes the dominant stem cell. It is also unclear why renewal foci 
slowly enlarge over time, because there is no obvious recurrent injury. 
Perhaps foci are programmed anatomical domains of alveolar renewal, 
with new cells moving out from a specialized niche akin to intestinal 
crypts, albeit with much slower turnover. Whatever the explanation, 
there are clearly regional influences as both renewal and tumour foci 
are preferentially located in perivascular and peripheral lung domains, 
possible sources of stem cell signals. 

EGFR signalling selectively stimulates proliferation of AT2 cells in 
vitro, and oncogenic Kras(G12D) permanently and selectively activates 
proliferation in vivo, efficiently transforming AT2 cells into rapidly 
growing monoclonal tumours (‘lung tumour focus’, Fig. 5b). By virtue 
of their large numbers, class susceptibility and robust adenomatous 
response to Kras(G12D), AT2 cells may constitute the major cell type 
responsible for human lung adenocarcinoma, making them among the 
most dangerous cells in the body. 

We propose that the oncogenic potential of AT2 cells is a direct 
consequence of their stem cell function: AT2 cells are poised to func- 
tion as alveolar stem cells and EGFR/KRAS signalling regulates the 
self-renewal part of the stem cell program (and the related process of 
self-duplication); another signalling pathway (Supplementary Table 2) 
must control AT2 reprogramming to AT1 fate (Fig. 5a). This model 
predicts that dying AT1 cells secrete an EGF that initiates self-renewal, 
plus another signal for fate reprogramming. Dying AT2 cells presumably 
produce only the former. It will be important to identify these signals 
and the events they control. This could suggest new strategies for early 

Figure 5 | Model of alveolar progenitors and stem cells in development, 
maintenance, and cancer. a, Bipotent progenitors expressing some AT1 
(green) and AT2 (red) markers differentiate into AT1 or AT2 cells. Mature AT2 
cells function as stem cells intermittently activated for alveolar renewal and 
repair. Dying AT1 cells are proposed to produce a signal (S1) transduced by 
EGFR-KRAS that activates division of a nearby AT2 cell (self-renewal); another 
signal (S2) reprograms a daughter into an AT1 cell. Activating mutations of 
Egfr or Kras in AT2 cells drive constitutive self-duplication, forming tumour of 
AT2-like cells. b, Rare AT2 cells function as stem cells, giving rise to clonal 
renewal foci (red, left) that slowly enlarge, with persistence of founder AT2 
cell (F1). With injury, additional AT2 cells (F2) are recruited to generate repair 
foci (red, right). Activating Kras mutation in AT2 cells (F3) initiates tumour 
focus (red, top). 

METHODS SUMMARY 

Cre recombinase was used to activate Rosa26-reporters and a Kras(LSL-G12D) 
knock-in. For immunostaining, lungs were agarose-inflated, then fixed in para- 
formaldehyde (PFA) and cryo-embedded or vibratome-sliced and fixed in 
methanol:dimethylsulphoxide or PFA. Secondary antibodies were Alexa Fluor- 
conjugated, or horseradish peroxidase-conjugated with tyramide amplification. 
Specimens were imaged by confocal (slices) or epifluorescence (cryosections) 
microscopy. For cell purification, lungs were dissociated then sorted by fluor- 
escent markers or surface antigens using FACS or MACS. AT2 cells were cultured 
on Matrigel or on glass with serum. Expression profiling used the Affymetrix platform. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 


Received 3 December 2012; accepted 3 December 2013. 
Published online 5 February 2014. 


1. Bertalanffy, F. D. & Leblond, C. P. Structure of respiratory tissue. Lancet 266, 
1365-1368 (1955). 

2. Siegel, R., Naishadham, D. & Jemal, A. Cancer statistics, 2013. CA Cancer J. Clin. 63, 
11-30 (2013). 

3. Rock, J. R. & Hogan, B. L. Epithelial progenitor cells in lung development, 
maintenance, repair, and disease. Annu. Rev. Cell Dev. Biol. 27, 493-512 (2011). 

4. Chapman, H. A. et al. Integrin α6β4 identifies an adult distal lung epithelial 
population with regenerative potential in mice. J. Clin. Invest. 121, 2855-2862 
(2011). 

5. Adamson, I. Y. & Bowden, D. H. Derivation of type 1 epithelium from type 2 cells in 
the developing rat lung. Lab. Invest. 32, 736-745 (1975). 

6. Spencer, H. & Shorter, R. G. Cell turnover in pulmonary tissues. Nature 194, 880 
(1962). 

7. Evans, M. J. & Bils, R. F. Identification of cells labeled with tritiated thymidine in the 
pulmonary alveolar walls of the mouse. Am. Rev. Respir. Dis. 100, 372-378 (1969). 

8. Rock, J. R. et al. Multiple stromal populations contribute to pulmonary fibrosis 
without evidence for epithelial to mesenchymal transition. Proc. Natl Acad. Sci. USA 
108, E1475-E1483 (2011). 

9. Barkauskas, C. E. et al. Type 2 alveolar cells are stem cells in adult lung. J. Clin. 
Invest. 123, 3025-3036 (2013). 

10. Evans, M. J., Cabral, L. J., Stephens, R. J. & Freeman, G. Renewal of alveolar 
epithelium in the rat following exposure to NO2. Am. J. Pathol. 70, 175-198 (1973). 


194 | NATURE | VOL 507 | 13 MARCH 2014 


11. Adamson, I. Y. & Bowden, D. H. The type 2 cell as progenitor of alveolar epithelial 
regeneration. A cytodynamic study in mice after exposure to oxygen. Lab. Invest. 
30, 35-42 (1974). 

12. Kim, C. F. et al. Identification of bronchioalveolar stem cells in normal lung and lung 
cancer. Cell 121, 823-835 (2005). 

13. McQualter, J. L., Yuen, K., Williams, B. & Bertoncello, I. Evidence of an epithelial 
stem/progenitor cell hierarchy in the adult mouse lung. Proc. Natl Acad. Sci. USA 
107, 1414-1419 (2010). 

14. Kumar, P. A. et al. Distal airway stem cells yield alveoli in vitro and during lung 
regeneration following H1N1 influenza infection. Cell 147, 525-538 (2011). 

15. Kajstura, J. et al. Evidence for human lung stem cells. N. Engl. J. Med. 364, 
1795-1806 (2011). 

16. Rawlins, E. L. et al. The role of Scgb1a1+ Clara cells in the long-term maintenance 
and repair of lung airway, but not alveolar, epithelium. Cell Stem Cell 4, 525-534 
(2009). 

17. Van Keymeulen, A. et al. Distinct stem cells contribute to mammary gland 
development and maintenance. Nature 479, 189-193 (2011). 

18. Burri, P. H. & Moschopulos, M. Structural analysis of fetal rat lung development. 
Anat. Rec. 234, 399-418 (1992). 

19. Buckingham, S., McNary, W. F., Jr & Sommers, S. C. Pulmonary alveolar cell 
inclusions: their development in the rat. Science 145, 1192-1193 (1964). 

20. Williams, M. C. & Dobbs, L. G. Expression of cell-specific markers for alveolar 
epithelium in fetal rat lung. Am. J. Respir. Cell Mol. Biol. 2, 533-542 (1990). 

21. Hughes, G. M. Ultrastructure of the lung of Neoceratodus and Lepidosiren in relation 
to the lung of other vertebrates. Folia Morphol. (Praha) 21, 155-161 (1973). 

22. Miller, L. A., Wert, S. E. & Whitsett, J. A. Immunolocalization of sonic hedgehog (Shh) 
in developing mouse lung. J. Histochem. Cytochem. 49, 1593-1603 (2001). 

23. Messier, B. & Leblond, C. P. Cell proliferation and migration as revealed by 
radioautography after injection of thymidine-H3 into male rats and mice. Am. 
J. Anat. 106, 247-285 (1960). 

24. Bowden, D. H., Adamson, I. Y. & Wyatt, J. P. Reaction of the lung cells to a high 
concentration of oxygen. Arch. Pathol. 86, 671-675 (1968). 

25. Herbst, R. S., Heymach, J. V. & Lippman, S. M. Lung cancer. N. Engl. J. Med. 359, 
1367-1380 (2008). 

26. Sutherland, K. D. & Berns, A. Cell of origin of lung cancer. Mol. Oncol. 4, 397-403 
(2010). 

27. Xu, X. et al. Evidence for type II cells as cells of origin of K-Ras-induced distal lung 
adenocarcinoma. Proc. Natl Acad. Sci. USA 109, 4910-4915 (2012). 

28. Lin, C. et al. Alveolar type II cells possess the capability of initiating lung tumor 
development. PLoS ONE 7, e53817 (2012). 

29. Nikitin, A. Y. et al. Classification of proliferative pulmonary lesions of the mouse: 
recommendations of the mouse models of human cancers consortium. Cancer 
Res. 64, 2307-2316 (2004). 

30. Takanami, I., Takeuchi, K. & Kodaira, S. Tumor-associated macrophage infiltration 
in pulmonary adenocarcinoma: association with angiogenesis and poor 
prognosis. Oncology 57, 138-142 (1999). 

31. Brody, J. S., Burki, R. & Kaplan, N. Deoxyribonucleic acid synthesis in lung cells 
during compensatory lung growth after pneumonectomy. Am. Rev. Respir. Dis. 
117, 307-316 (1978). 

32. Ding, B. S. et al. Endothelial-derived angiocrine signals induce and sustain 
regenerative lung alveolarization. Cell 147, 539-553 (2011). 

33. Kretzschmar, K. & Watt, F. M. Lineage tracing. Cell 148, 33-45 (2012). 

34. Jackson, E. L. et al. Analysis of lung tumor initiation and progression using 
conditional expression of oncogenic K-ras. Genes Dev. 15, 3243-3248 (2001). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank A. Andalon for technical assistance; H. Chapman 
(SftpC-Cre-ER-ntTA), B. Hogan (CCSP-Cre-ER), H. Ueno and I. Weissman (Rainbow), 
H. Clevers (Confetti), L. Luo (mTmG), and J. Sage (Kras(LSL-G12D)) for strains; B. Stripp for 
goat anti-CCSP antibody; F. H. Espinoza for annotated gene lists; R. Metzger, 
H. Chapman, and members of the Krasnow laboratory for discussions and comments 
on the manuscript; and Maria Petersen for help preparing figures and the manuscript. 


Author Contributions T.J.D. conducted the experiments except the gene expression 
profiling and AT2 cell cultures, which were done by D.G.B.; T.J.D., D.G.B. and M.A.K. 
conceived the experiments, analysed the data and wrote the manuscript. This work was 
supported by a Parker B. Francis Foundation Fellowship and NIH 5K08HL084095 
Award (T.J.D.), NIH T32HD007249 (D.G.B.), and an NHLBI U01HL099995 Progenitor 
Cell Biology Consortium grant (M.A.K.). M.A.K. is an investigator of the Howard Hughes 
Medical Institute. 


Author Information Microarray datasets were deposited at Gene Expression Omnibus 
(accession code GSE49346) and GEXC (https://gexc.stanford.edu/population/detail/ 
998 and https://gexc.stanford.edu/population/detail/999). Reprints and permissions 
information is available at www.nature.com/reprints. The authors declare no 
competing financial interests. Readers are welcome to comment on the online version 
of the paper. Correspondence and requests for materials should be addressed to 
M.A.K. (krasnow@stanford.edu) or T.J.D. (tdesai@stanford.edu). 


©2014 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


doi:10.1038/nature13124 


C9orf72 nucleotide repeat structures 
initiate molecular cascades of disease 

Nicholas J. Maragakis, Juan C. Troncoso, Akhilesh Pandey, Rita Sattler, Jeffrey D. Rothstein & Jiou Wang 


A hexanucleotide repeat expansion (HRE), (GGGGCC)n, in C9orf72 is the most common genetic cause of the neurodegen- 
erative diseases amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). Here we identify a molecular 
mechanism by which structural polymorphism of the HRE leads to ALS/FTD pathology and defects. The HRE forms DNA 
and RNA G-quadruplexes with distinct structures and promotes RNA•DNA hybrids (R-loops). The structural polymor- 
phism causes a repeat-length-dependent accumulation of transcripts aborted in the HRE region. These transcribed repeats 
bind to ribonucleoproteins in a conformation-dependent manner. Specifically, nucleolin, an essential nucleolar protein, 
preferentially binds the HRE G-quadruplex, and patient cells show evidence of nucleolar stress. Our results demonstrate 
that distinct C9orf72 HRE structural polymorphism at both DNA and RNA levels initiates molecular cascades leading to 
ALS/FTD pathologies, and provide the basis for a mechanistic model for repeat-associated neurodegenerative diseases. 


Nucleotide repeat elements, including microsatellites and short tandem 
repeats, are common in eukaryotic genomes. Recently, a hexanucleo- 
tide repeat expansion (HRE), (GGGGCC)n, in a noncoding region of 
C9orf72 was linked to the neurodegenerative diseases amyotrophic 
lateral sclerosis (ALS) and frontotemporal dementia (FTD). ALS is 
characterized by a loss of motor neurons, with 90% of ALS cases being 
sporadic and the other ~10% having a family history; the C9orf72 HRE 
represents the most common genetic cause of both familial and spora- 
dic ALS. FTD is characterized by degeneration of the frontal and tem- 
poral lobes of the brain and is the second most common type of dementia 
in people younger than 65. Again, the C9orf72 HRE is one of the most 
common genetic causes of FTD. Increasing evidence suggests that ALS 
and FTD are two related diseases in a continuous clinical spectrum, 
and there is a possibility that the C9orf72 HRE also contributes to 
Alzheimer’s and Huntington’s diseases. 

Normal human C9orf72 alleles have 2 to 25 intronic GGGGCC repeats, 
with the majority having fewer than eight repeats and more than half 
having two repeats. The expanded repeats associated with ALS/FTD 
are thought to have variable lengths, ranging from tens to thousands of 
hexanucleotide repeats, but correlations between the repeat lengths 
and clinical onset or progression have yet to be established. Although a 
molecular understanding of C9orf72 HRE pathological phenotypes has 
begun to emerge, the mechanisms by which the GGGGCC repeat expan- 
sion causes ALS/FTD pathology are unknown and our understanding of 
nucleotide repeats in the context of human disease is still in its infancy. 

Here we report that the C9orf72 HRE DNA/RNA sequence is struc- 
turally polymorphic; it can fold into stable G-quadruplex secondary 
structures and form transcriptionally induced RNA•DNA hybrids known 
as R-loops. The DNA of the C9orf72 HRE forms both antiparallel- and 
parallel-stranded G-quadruplexes, whereas the RNA adopts only parallel- 
stranded G-quadruplex conformations. These structural features of the 
C9orf72 HRE lead to truncated HRE-containing abortive transcripts. 
We identified ribonucleoproteins bound to the repeat RNA in an RNA 
conformation-dependent manner. Nucleolin (NCL), an essential nucle- 
olar protein that binds specifically to C9orf72 HRE G-quadruplexes, is 


mislocalized in patient cells carrying the mutation. Accordingly, nucle- 
olar function is impaired in patient cells. Furthermore, this nucleolar 
pathology can be recapitulated by introducing C9orf72 HRE-containing 
abortive transcripts into wild-type cells. These results point to the struc- 
tural conformations of both the DNA and RNA hexanucleotide repeats 
as fundamental determinants of the pathogenic mechanisms of C9orf72 
HRE-linked ALS/FTD. 


G-quadruplexes formed by HRE DNA/RNA 

To understand how expansion of the GGGGCC repeat in the C9orf72 
gene impedes cellular functions and leads to the associated diseases, we 
began by examining the repeat sequence for unique structural charac- 
teristics. The C9orf72 HRE GGGGCC DNA repeat sequence has prop- 
erties that would allow it to form G-quadruplexes, which are stacks of 
planar tetramers consisting of four guanines connected by Hoogsteen 
hydrogen bonds. To look for such quadruplexes we first examined 
the secondary structures formed by the GGGGCC hexanucleotide 
repeats using circular dichroism (CD) absorptivity. The (GGGGCC)4 
DNA shows a characteristic spectrum for antiparallel G-quadruplexes 
in 100 mM KCl, with a maximum absorbance at 295 nm and a mini- 
mum at 260 nm (Fig. 1a). This conformational change at physiologic- 
ally relevant levels of KCl (100 mM) is also apparent in the decreased 
mobility of the DNA in a native polyacrylamide gel electrophoresis 
(PAGE) shift assay (Fig. 1b). 
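
Whether a sequence is even a candidate for G-quadruplex folding can be screened computationally before any spectroscopy. The sketch below applies a widely used sequence heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) to four sense repeats and four antisense repeats. The pattern and the function name are illustrative assumptions, not the authors' method, and a match says nothing about whether the fold is parallel or antiparallel; that still requires experiments such as the CD and gel-shift assays described above.

```python
import re

# Heuristic for a putative intramolecular G-quadruplex: four G-runs
# (>= 3 guanines each) separated by loops of 1-7 nucleotides.
# This is a common screening pattern, not a structural proof.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGTU]{1,7}?G{3,}){3}")


def has_putative_g4(seq: str) -> bool:
    """Return True if the DNA or RNA sequence contains a putative G4 motif."""
    return G4_PATTERN.search(seq.upper()) is not None


sense = "GGGGCC" * 4      # four sense hexanucleotide repeats
antisense = "CCCCGG" * 4  # four antisense repeats (G-runs of only two)

print("sense repeats:", has_putative_g4(sense))
print("antisense repeats:", has_putative_g4(antisense))
```

The antisense repeats fail the screen because their guanine runs are only two bases long, which fits the text's later use of the antisense sequence as a hairpin-forming comparison.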

To provide further conformational insight into the DNA G-quadruplex 
formed by (GGGGCC)4, we used dimethyl sulphate (DMS) footprint- 
ing following classical Maxam and Gilbert sequencing methods. On 
the sequencing gel (Fig. 1c), the band pattern resulting from guanine 
depurination is markedly reduced in the presence of KCl, indicating 
that the guanine N7 positions are protected from DMS methylation and 
subsequent base cleavage as a result of hydrogen bonding in a G-quartet 
conformation. These results provide evidence that the (GGGGCC)4 
DNA forms a four-stack antiparallel G-quadruplex motif, with the 
guanines on the exterior of the stacks being more accessible to chemical 
modifications than the guanines buried in the interior (Fig. 1d). Through 

Figure 1 | DNA and RNA of the C9orf72 HRE form G-quadruplexes. a, The 
CD absorptivity shows characteristic K+-dependent spectra for antiparallel 
DNA G-quadruplexes with (GGGGCC)4. b, The presence of K+ during 
annealing induces a conformational change that decreases the mobility of 
(GGGGCC)4 DNA in a gel mobility shift assay. c, A DMS footprinting assay on 
(GGGGCC)4 DNA shows protection of the N7 positions on all the guanines 
when a G-quadruplex is formed. d, The proposed topology for the antiparallel 
DNA G-quadruplex formed by (GGGGCC)4. Each grey plane represents a 
G-quartet, as shown in the lower corner. Four separate G-quartets are stacked 


a series of spectroscopic characterizations, we observed that the GGGGCC 
sequence can adopt intermolecular, intramolecular, and also parallel 
G-quadruplex structures when the number of repeats is varied (Extended 
Data Fig. 1a-f and Supplementary Results). However, the antiparallel 
G-quadruplex seems to be the most dominant and the most stable con- 
formation at physiologically relevant conditions. 

We next investigated the structural conformation and stability of the 
C9orf72 RNA HRE sequences. In a series of experiments measuring 
CD absorptivity, the (GGGGCC)4 RNA shows a signature spectrum 
for parallel G-quadruplexes, with a minimum absorbance at 240 nm 
and a maximum at 260 nm (Fig. 1e), which is consistent with recent 
results for similar RNA HRE sequences. When the (GGGGCC)4 RNA 
is examined by native PAGE, its mobility is decreased in the presence 
of 100 mM KCl, in agreement with a K+-dependent formation of G- 
quadruplexes (Fig. 1f). 

To provide further conformational insight, we performed an RNase 
protection assay. The RNA digestion pattern for endonuclease RNase T1, 
which cleaves single-stranded RNAs at the 3’ end of guanine residues, 
shows clear structural differences in the presence and absence of KCl. 
Without KCl, the (GGGGCC)4 RNA forms a secondary structure con- 
sistent with single-stranded bulges and hairpin regions as indicated by 
the RNase protection pattern (Fig. 1g). In contrast, in the presence of 
100 mM KCl the RNA adopts a G-quadruplex structure, in which every 
fourth guanine is highly sensitive to single-stranded cleavage. This 
provides evidence for a topology of a three-stacked parallel-stranded 
G-quadruplex with a guanine and two cytosines in the single-stranded 
loop region (Fig. 1h). Together, these data indicate that the C9orf72 
HRE RNA preferentially adopts a parallel G-quadruplex topology and 
is stable at physiological KCl concentrations, pH and temperature 
(Extended Data Fig. 1g). 


Impeded transcription and abortive RNAs 


To understand the functional consequence of structural polymorphism 
at the C9orf72 HRE, we examined transcription of the repeat region. 
First, we established an in vitro transcription assay by creating GGGGCC 

5’ to 3’ with two cytosines forming each loop region. e, The CD spectra 
identified the formation of a parallel G-quadruplex for the RNA (GGGGCC)4 
in the presence of 100 mM KCl. f, (GGGGCC)4 RNA demonstrates a slower 
mobility in the presence of KCl compared to (CCCCGG)4 RNA. g, The RNase 
T1 protection assay identifies single-stranded guanine residues (denoted in 
black) not involved in the formation of the G-quadruplex. h, The proposed 
parallel G-quadruplex topology formed by the RNA of the C9orf72 HRE. Three 
stacks form the parallel G-quadruplex, with the RNase T1-sensitive guanines 
shown as black dots. 


hexanucleotide repeats of varying lengths (3-70 repeats) de novo via 
repeat-primed PCR, and placing the repeats downstream of a T7 pro- 
moter as transcription templates. When in vitro transcriptional pro- 
ducts are resolved by denaturing PAGE, periodic abortive transcripts 
are observed, with longer repeats producing increasing amounts of trun- 
cated transcripts that correspond to the template hexanucleotide repeats 
(Fig. 2a). Importantly, the increase in the abortive transcripts causes a 
concomitant decrease in full-length transcripts. To quantify this obser- 
vation, we compared the levels of full-length transcripts, which contain 
regions downstream (3’) of the repeats, with the total levels of all tran- 
scripts, which contain regions upstream (5’) of the repeats (Fig. 2a). The 
plot of this 3'/5’ ratio shows that there is a transition to more abortive 
transcripts relative to full-length products with increasing repeats (Fig. 2b). 
Furthermore, we verified that the repeat templates were not extensively 
modified by depurination of the DNA sequence, excluding this poten- 
tial modification as the cause of the observed abortive transcription 
(Extended Data Fig. 2). Therefore, these in vitro results demonstrate 
that the C9orf72 HRE causes RNA polymerase processivity to be impaired 
in the repeat region, leading to an accumulation of repeat-containing 
abortive transcripts and loss of full-length transcripts. 
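
The dependence of the 3’/5’ ratio on repeat number can be summarised with a simple decay fit. The sketch below fits ratio = a * exp(-k * n) by ordinary least squares on the logarithm of the ratio. The model form without an offset term, the function name and every data value are assumptions made for illustration; none of the numbers are taken from the paper, and the authors' own fitting procedure is not described in this section.

```python
import math


def fit_single_exponential(repeats, ratios):
    """Fit ratio = a * exp(-k * n) by a straight-line fit to ln(ratio).

    Returns (a, k). All ratios must be positive.
    """
    xs = list(repeats)
    ys = [math.log(r) for r in ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical densitometry values, NOT data from the paper.
repeat_counts = [3, 10, 21, 40, 70]
ratio_3_over_5 = [0.95, 0.80, 0.60, 0.38, 0.18]

a, k = fit_single_exponential(repeat_counts, ratio_3_over_5)
print(f"a = {a:.2f}, k = {k:.4f} per repeat")
print(f"ratio halves every {math.log(2) / k:.0f} repeats")
```

Fitting on the log scale weights small ratios more heavily than a direct nonlinear fit would, so the two approaches can give slightly different parameters on noisy data.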

Next, we investigated the molecular mechanisms underlying abort- 
ive RNA transcript production. We examined the structural features 
unique to the C9orf72 HRE region on the plasmid DNA and found 
that, consistent with data on the DNA oligonucleotides in Fig. 1, the 
plasmid also forms stable G-quadruplexes in the presence of KCl, which 
was directly assessed by the preferential binding of the G-quadruplex- 
specific BG4 nanobody to the DNA G-quadruplex structure (Extended 
Data Fig. 3a, b). The formation of these G-quadruplexes leads to tran- 
scripts being aborted earlier and to a further decrease in full-length 
transcripts in the in vitro transcription assay; these repeat-containing 
abortive transcripts accumulate over time (Extended Data Fig. 3c, d). 
These results indicate that the formation of G-quadruplexes impairs 
polymerase processivity within the C9orf72 HRE region. 

Next, we examined the nascent RNA transcript to look for possible 
mechanistic contributions to abortive transcription. First, using a 

Figure 2 | Abortive transcription in the C9orf72 HRE. a, Increasing lengths 
of GGGGCC repeats cause accumulation of abortive transcripts in a length- 
dependent manner in vitro. The transcriptional products were separated on a 
denaturing gel with a 500-nucleotide ssRNA control (CTL). b, Transcript levels 
shown in a were densitometrically quantified and then plotted as the ratio of 
full-length transcripts that contain regions 3’ of the repeat to all transcripts that 
contain 5’ regions. The curve was fit to a single exponential. Data are 
means ± s.d. n = 4. c, The C9orf72 HRE induces the formation of R-loops on 
C9orf72 HRE-containing plasmids with (GGGGCC)~70. Treatment of the 
in vitro transcription products with RNase A and H digests the RNA still 
hybridized with relaxed or supercoiled plasmid and reduces the smearing that 
was caused by the size heterogeneity of RNA•DNA hybrids. Genomic DNA 
(top band) serves as an internal loading control. d, Patients carrying the C9orf72 
HRE have reduced pre-mRNA 3’/5’ ratios relative to C9orf72 wild type (WT), 
consistent with the HRE-induced abortive transcription reducing full-length 
transcript levels. Data are means ± s.e.m. n = 5/6 (B lymphocytes), n = 12/10 
(motor cortex), n = 8/5 (spinal cord) for C9orf72 WT/HRE samples, 
respectively. ***P < 0.001, **P < 0.01, *P < 0.05. 


G-quadruplex•haemin complex-catalysed peroxidase-like colorimet- 
ric assay, we demonstrated that the repeat-containing transcripts 
also form RNA G-quadruplexes, but the formation of these structures 
on nascent RNA during transcription has negligible effects on the 
decrease in polymerase processivity (Extended Data Fig. 4a, b and 
Supplementary Results). However, we noted that treatment with 
RNase H, which specifically digests RNA in an RNA•DNA hybrid, 
decreases the accumulation of abortive transcripts and increases 
longer transcripts during the assay (Extended Data Fig. 4c). To test 
whether nascent RNA•DNA template hybrids (R-loops) form on the 
GGGGCC repeats, we performed the in vitro transcription assay, and 
then treated the samples with RNase H and RNase A to remove the 
RNA in this hybrid state and the excess single-stranded RNA, respec- 
tively. In the absence of the RNases, the mobility of the plasmid decreases, 
and consistent with the heterogeneity of their sizes as a result of R-loop 
formation, the plasmids migrate as a smear above the previously com- 
pact supercoiled/circular plasmid band (Fig. 2c and Supplementary 
Results). However, upon simultaneous treatment with RNase H and 
RNase A, the plasmids then migrate as distinct compact bands. The 
control plasmid shows a minimally altered mobility when compared 
to the HRE-containing plasmid after treatment with RNase H and/or 
RNase A (Extended Data Fig. 4d, e). Together, these results provide 
evidence that the formation of R-loops is an additional structurally 
polymorphic feature of the C9orf72 HRE that contributes to the abort- 
ive transcription mechanism. 

To further extend our observations concerning in vitro abortive 
transcription to possible truncation of the endogenous C9orf72 tran- 
script, we examined the production of pre-messenger RNA transcripts 
at different positions along the C9orf72 gene in patient cells. Primers 
targeting intronic regions immediately upstream (5’) of the HRE ver- 
sus downstream (3’) of the HRE region were used with quantitative 
reverse transcription polymerase chain reactions (qRT-PCR) to mea- 
sure pre-mRNA levels. These results indicate that patient B lympho- 
cytes carrying the C9orf72 HRE show a significant decrease in the ratio 
of 3'/5’ pre-mRNA levels relative to the controls (Fig. 2d). This obser- 
vation was extended to pathologically relevant tissue from patients such 
as the motor cortex and cervical spinal cord. Quantitative measure- 
ments of pre-mRNA levels using NanoString technology show a sim- 
ilar decrease in the ratio of 3’/5’ pre-mRNA levels in patients carrying 
the HRE. These results indicate that the endogenous C9orf72 HRE in 
patients induces abortive transcription and results in unequal tran- 
scriptional efficiency between regions upstream and downstream of 
the HRE, in agreement with our in vitro abortive transcript results 
(Fig. 2a, b). 
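
One common way to turn qRT-PCR cycle thresholds (Ct) into a 3’/5’ ratio is the comparative Ct calculation. The sketch below assumes that both primer pairs amplify with perfect efficiency (one doubling per cycle) and uses made-up Ct values. This section does not state which quantification scheme the authors used, so the function names and the calculation are an illustration only.

```python
def ratio_3_to_5(ct_3prime: float, ct_5prime: float) -> float:
    """Abundance of the 3' amplicon relative to the 5' amplicon.

    Assumes each PCR cycle doubles the product, so a one-cycle delay in
    the 3' amplicon corresponds to half as much template.
    """
    return 2.0 ** (ct_5prime - ct_3prime)


def normalised_ratio(sample_ratio: float, control_ratio: float) -> float:
    """Express a sample's 3'/5' ratio relative to a control sample."""
    return sample_ratio / control_ratio


# Hypothetical Ct values, not measurements from the paper.
control = ratio_3_to_5(ct_3prime=25.0, ct_5prime=25.0)
patient = ratio_3_to_5(ct_3prime=26.0, ct_5prime=25.0)

print(f"control 3'/5' ratio: {control:.2f}")
print(f"patient ratio relative to control: {normalised_ratio(patient, control):.2f}")
```

If the two primer pairs differ in efficiency, the raw ratio is biased, which is one reason to report the patient ratio relative to a control measured with the same primers.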


Structure-dependent HRE-binding proteins 


Because the C9orf72 HRE can generate both full-length and abortive 
transcripts capable of forming RNA G-quadruplexes, we explored 
conformation-dependent protein interactions associated with the RNA 
repeat sequences. Biotinylated (GGGGCC)4 RNAs, in either G-quadruplex 
or hairpin conformations, and antisense (CCCCGG)4 hairpin RNAs 
were conjugated to streptavidin beads (Extended Data Fig. 1h). To com- 
prehensively and quantitatively identify the conformation-dependent 
ribonucleoprotein complexes formed by these HRE RNAs, we used 
stable isotope labelling by amino acids in cell culture (SILAC) with 
HEK293T cells grown in medium containing normal, medium or heavy 
amino acids before performing the RNA pull-down as shown in 
Extended Data Fig. 5. In brief, labelled cell lysates were incubated with 
RNA-conjugated streptavidin beads and the isolated ribonucleoprotein 
complexes were washed with increasing KCl concentrations to remove 
proteins that were loosely associated with the RNA or with other proteins. 
Then the final eluted fractions were analysed for complexes binding 
the (GGGGCC)4 G-quadruplex, (GGGGCC)4 hairpin, and (CCCCGG)4 
hairpin with the respective labels by mass spectrometry. The SILAC 
analysis provided a list of 288 proteins with quantification of their 
binding preferences (Supplementary Table 1) for the various RNA 
structures formed by the sense GGGGCC or the antisense CCCCGG 
sequences. 
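
A three-channel SILAC experiment of this kind yields, for each protein, one intensity per bait, and binding preference is usually read as a ratio between channels. The sketch below shows that calculation for a single hypothetical protein. The intensities, bait labels and function name are invented for illustration, and this section does not say which isotope label was paired with which RNA structure.

```python
import math


def log2_preference(intensity_a: float, intensity_b: float) -> float:
    """log2 ratio of two channel intensities; positive means a is preferred."""
    return math.log2(intensity_a / intensity_b)


# Hypothetical summed peptide intensities for one protein, one per RNA bait.
intensities = {
    "G-quadruplex": 8.0e6,
    "sense hairpin": 2.0e6,
    "antisense hairpin": 1.0e6,
}

preferred = max(intensities, key=intensities.get)
print(f"preferred bait: {preferred}")
for bait, value in intensities.items():
    if bait != preferred:
        score = log2_preference(intensities[preferred], value)
        print(f"log2({preferred} / {bait}) = {score:.1f}")
```

Working in log2 units makes enrichment and depletion symmetric around zero, which is convenient when ranking a few hundred proteins.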

To confirm the specificity of the major proteins in binding to the 
various RNA structures identified by SILAC, we performed western 
blot analysis on the increasingly stringent fractions from the RNA 
pull-down (Fig. 3a). NCL and heterogeneous nuclear ribonucleopro- 
tein (hnRNP) U preferentially recognize the RNA G-quadruplex motif 
(Fig. 3 and Extended Data Fig. 6). The proteins hnRNP F and RPL7, a 
heterogeneous nuclear ribonucleoprotein and a ribosomal protein, res- 
pectively, prefer guanine-rich RNA and bind both the G-quadruplex 
and the alternative hairpin of the GGGGCC repeat. The protein hnRNP K, 
an RNA-binding protein that binds cytosine-rich RNA, shows pref- 
erential binding to the antisense CCCCGG repeat. Differential KCl salt 
elution indicates that whereas the ribonucleoprotein complexes with 
hnRNP U or hnRNP F are more easily destabilized with increasing KCl, 
the interactions with NCL or RPL7 are more robust and survive high 
KCl concentrations. These results are in accordance with the SILAC 
quantitative analysis. To further demonstrate that NCL binds directly 
to the G-quadruplex motif, we used an in vitro RNA pull-down with 

Figure 3 | Identification of conformation-dependent C9orf72 HRE RNA- 
binding proteins. a, Western blotting analyses of the sequential fractions 
eluted from a biotinylated-RNA pull-down of HEK293T cells with increasing 
KCl concentrations. The RNA-binding proteins identified using LC-MS were 
differentiated among those that recognize different structural motifs, including 
the RNA antisense (AS) hairpin, (CCCCGG)4; the RNA sense (S) hairpin 
(GGGGCC)4; and the sense G-quadruplex (G). NCL and hnRNP U have 
binding preferences for the G motif of the sense RNA. hnRNP F and RPL7 bind 
both guanine-rich sense sequences, regardless of the underlying RNA structure. 
hnRNP K prefers binding to the cytosine-rich AS. b, A representative spectrum 
from SILAC analysis is shown for the preferential binding of NCL to the G 
motif formed by (GGGGCC)4 compared to the hairpin motifs formed by the 
sense or antisense sequences. c, An RNA pull-down performed with GST-NCL 
demonstrates that NCL directly binds (GGGGCC)4 RNA in vitro with highest 
affinity for the G-quadruplex. Data are means ± s.d. n = 3. 


a recombinant glutathione S-transferase (GST)-NCL fusion protein. 
NCL was shown to preferentially pull down the (GGGGCC)4 RNA 
G-quadruplexes (Fig. 3c). These results establish NCL as a specific and 
enriched interactor that directly and preferentially binds the RNA 
G-quadruplexes formed by the C9orf72 HRE. 


The C9orf72 HRE causes nucleolar stress 

To investigate the consequences of the specific binding of NCL, a prin- 
cipal component of the nucleolus, to G-quadruplex complexes formed 
by the C9orf72 HRE RNA, we examined phenotypic differences in the 
cellular localization of NCL in C9orf72 HRE patient cells and controls. 
In control B lymphocyte cells, endogenous NCL immunofluorescent 
staining is condensed in the nucleolus. In contrast, the nucleolus appears 
more fractured and NCL is more dispersed throughout the nucleus in 
patient cells carrying the C9orf72 HRE, as shown in heat maps indi- 
cating the NCL density (Fig. 4a). Quantification of NCL distribution in 
the nuclei of patient and control cells confirms a significant shift of 
NCL away from dense nucleoli to a more dispersed localization (Extended 
Data Fig. 7a). This NCL pathology is also observed in C9orf72 patient 
fibroblasts, but not in fibroblasts from non-ALS or non-C9orf72 ALS 
controls (Extended Data Fig. 7a). Next, we examined the NCL locali- 
zation pattern in induced pluripotent stem (iPS) motor neuron cells 
derived from patient fibroblasts”. As seen for the B lymphocytes, immu- 
nofluorescent staining for NCL in disease-relevant motor neurons 
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Figure 4 | Nucleolar stress is a result of repeat-containing RNA transcripts 
from the C9orf72 HRE. a, In the control C9o0rf72 WT B lymphocytes, 

NCL (green) is localized to the condensed nucleolus. In contrast, the cells of 
patients with the C9orf72 HRE show an increased NCL diffusion and fractured 
nucleoli in the nucleus (Hoechst, blue). A heat map of NCL intensities marks 
the difference between cells. b, iPS motor neurons derived from patients 
carrying the C9orf72 HRE also demonstrate NCL mislocalization. f-III tubulin 
(Tuj1) (red) was used to identify neurons. c, NCL colocalizes with RNA foci 
(red) formed in motor cortex tissue from patients carrying the C9orf72 HRE. 
A (CCCCGG), 5 probe was used to detect the (GGGGCC),, RNA foci, a 
previously identified pathological feature of the C9orf72 HRE tissues. 

d, Transfection of (GGGGCC),, abortive transcripts (Fig. 2a) recapitulates 
NCL pathological features observed in patient cells with the C9orf72 HRE. 


from patients with the C90rf72 HRE shows NCL dispersion that occu- 
pies a significantly increased area of the nucleus (Fig. 4b and Extended 
Data Fig. 7a). No obvious phenotypic differences were observed for 
hnRNP F, hnRNP U, or hnRNP K between C9orf72 HRE B lympho- 
cytes and those of controls (Extended Data Fig. 7d). 

To confirm the association of NCL with C9orf72 repeat RNA in 
pathologically relevant tissues of ALS/FTD patients with the C9orf72 
HRE, we performed RNA fluorescence in situ hybridization (FISH) in 
combination with NCL immunostaining in motor cortex tissues from 
these patients. The distribution of NCL in post-mortem tissues seems 
variable; however, it is evident that NCL frequently colocalizes with 
the GGGGCC RNA foci in the neurons of the motor cortex only in 
C9orf72 HRE ALS patients (Fig. 4c and Extended Data Fig. 7c). These 
results support the interaction of NCL and the GGGGCC repeat 
transcripts in vivo and indicate the possible occurrence of functional 
defects associated with nucleolar stress in ALS/FTD patients. 
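
The FISH detection described above depends on base pairing between an antisense probe and the sense repeat RNA. As a minimal sketch (the function name and the four-repeat test sequence are illustrative assumptions, not taken from the study), the following Python snippet shows that the reverse complement of a GGGGCC repeat tract contains CCCCGG repeat units, which is why a CCCCGG-repeat probe can hybridize to GGGGCC foci.

```python
# Watson-Crick pairing for RNA; only G and C occur in the repeat itself.
COMPLEMENT = {"G": "C", "C": "G", "A": "U", "U": "A"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


sense = "GGGGCC" * 4  # illustrative four-repeat sense tract (assumed length)
antisense = reverse_complement(sense)

# The antisense strand is a GGCCCC repeat, i.e. CCCCGG repeats in a shifted frame.
print(antisense)
print("CCCCGG" * 3 in antisense)
```

The same check works for any repeat count, because shifting the reading frame of a perfect repeat only changes where the unit is taken to start.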

Next, we examined whether the NCL mislocalization and nucleolar 
stress in the patient cells were caused by a gain in toxicity of the aberrant 
C9orf72 RNA. To address this question directly, we transfected HEK293T 
cells with the abortive transcripts generated in the aforementioned 
in vitro reactions. Treatment with the 21-repeat-containing abortive 
transcripts recapitulates the NCL pathology observed in patient cells 
carrying the C9orf72 HRE: NCL is significantly more dispersed in the
nucleus (Fig. 4d and Extended Data Fig. 7e, f). Furthermore, the trans- 
fected HRE transcriptional products show a concentration-dependent 
decrease in cell viability compared to a control transcript (Extended 
Data Fig. 7g). Thus, our results establish a direct link between abortive 
C9orf72 HRE transcripts, cytotoxicity and patient pathology. 

The mislocalization of another nucleolar component, nucleophosmin
(also known as B23), in B lymphocytes carrying the C9orf72 HRE
further indicates general nucleolar stress (Extended Data Fig. 8a, b).
To determine whether the functions of the nucleolus are impaired in
patient cells and tissues, we used qRT-PCR to measure the processing
and maturation of the 45S ribosomal RNA (Extended Data Fig. 8c).
The results show a decrease in the processing of the precursor 45S
rRNA into the mature cleaved 28S, 18S and 5.8S rRNAs in C9orf72
HRE patient B lymphocytes (Extended Data Fig. 8d). Furthermore,
this rRNA maturation is significantly decreased in the motor cortex
tissues from ALS patients carrying the C9orf72 HRE (Extended Data
Fig. 8d). Taken together, these results identify a functional defect involving
the biochemical effects of the C9orf72 HRE in nucleolar stress and
directly link this defect to disease pathology.

Next, to examine other RNA-related functional defects that could
arise in addition to the chronic nucleolar stress identified in C9orf72
HRE patients, we measured the abundance of processing bodies (P bodies)
in patient-derived motor neurons. P bodies, which are composed of
ribonucleoprotein complexes involved in the degradation of untranslated
mRNAs, are significantly increased in number, but not size, in
iPS motor neurons from patients carrying the C9orf72 HRE relative to
controls (Extended Data Fig. 9a, b). This increase in P bodies is consistent
with observations of decreased ribosomal maturation caused by
nucleolar stress, which can then lead to increased untranslated mRNA
in the cytoplasm and to a global perturbation in RNA processing.

Finally, to test whether the cells of patients harbouring the C9orf72
HRE are more sensitive to proteotoxic stress, we treated the iPS motor
neurons with tunicamycin, which induces endoplasmic reticulum stress
and unfolded protein response. C9orf72 HRE patient neurons show a
dose-dependent increase in sensitivity to tunicamycin compared to
controls (Extended Data Fig. 9c, d). These results are consistent with
the perturbation in protein homeostasis, which is possibly linked to the
nucleolar stress identified in ALS patients carrying the C9orf72 HRE.

Figure 5 | A model for the molecular cascade resulting from the C9orf72
HRE structural polymorphism. The DNA and RNA•DNA structures formed
in the GGGGCC repeat region impede RNA polymerase transcription, which
results in transcriptional pausing and abortion. This leads to a loss of full-length
products and an accumulation of abortive transcripts. Abortive transcripts
that contain the hexanucleotide repeats form G-quadruplexes and hairpins and
bind essential proteins in a conformation-dependent manner. Sequestration
of these proteins leads to nucleolar stress and other downstream defects. The
repeat-containing transcripts can also escape the nucleus and be bound by
ribosomal complexes, thereby increasing repeat-associated non-ATG-dependent
translation that results in aggregative polydipeptides.


Discussion

The biological activity of nucleic acids is determined not only by their
linear sequence of nucleotides, but also by their structural diversity.
Here we found that the generation of distinct polymorphic structures
of the C9orf72 HRE DNA and RNA, such as G-quadruplexes and
RNA•DNA hybrids (R-loops), underlies HRE-dependent molecular
events, including abortive transcription and sequestration of unique
RNA-binding proteins. The identification of conformation-dependent 
ribonucleoprotein complexes and a specific nucleolar pathology pro- 
vides a cohesive mechanism for the disease (Fig. 5 and Supplementary 
Discussion): An impairment in transcriptional processivity by the HRE 
leads to accumulation of abortive transcripts, and these repeat-containing 
RNAs sequester proteins that recognize their distinctive conformations, 
sensitizing cells for chronic neurodegenerative damage. Our findings 
concerning NCL represent a direct link between the characteristic 
G-quadruplexes of the C9orf72 HRE and the resulting cascade of 
pathological defects in ALS/FTD patients. 

The nucleolus is a central hub in cellular stress responses, and
NCL has been shown to have a critical role in the long-term maintenance
of mature neurons. Our results demonstrate that motor
neurons harbouring the C9orf72 HRE show defects in rRNA processing
and increased sensitivity to stresses related to protein homeostasis,
suggesting that affected patient neurons are more vulnerable to
age-dependent chronic stresses. Together, these results point to the 
targeting of these toxic nucleic acid conformations as a possible inter- 
vention at the root of the pathogenic cascade, and also suggest a mech- 
anistic model for similar repeat expansion neurodegenerative diseases 
that share nucleic acid features of the structurally polymorphic C9orf72 
HRE. 


METHODS SUMMARY 


CD analysis of nucleic acid structures was performed as described, EMSAs were
performed as recommended, and DMS protection assays followed a previously
described protocol. The RNase protection assays were performed following
Ambion’s recommendations. Plasmids were constructed in a pCR8-TOPO
vector (Invitrogen), and GGGGCC HRE inserts were generated using a
self-templating PCR protocol. In vitro transcription reactions were performed with
these plasmids and analysed on sequencing gels. R-loop assays were adapted from
previously described methods. The complementary DNAs from B lymphocytes
or RNAs from human tissues with or without the C9orf72 HRE were generated
from total RNA following manufacturer’s protocols and relative levels were then
measured. NanoString RNA analysis followed standard protocols as previously
described. The RNA pull-down with isotopically labelled HEK293T lysates and
biotinylated RNA conjugated to streptavidin beads followed a previously described
protocol, with an additional KCl gradient wash. Quantitative mass spectrometry
was performed by three-state SILAC analysis with a filter-aided sample
preparation (FASP) method, followed by analysis on an LTQ-Orbitrap Elite mass
spectrometer. Peptides were identified using the Mascot search algorithm.
Western blotting was performed on RNA pull-down fractions according to the
manufacturer’s recommendations for each antibody. Immunofluorescent staining
of lymphocytes, HEK293T cells, fibroblasts and iPS motor neurons followed a
standard protocol described in detail in Methods. RNA FISH with immunofluorescence
on human motor cortex tissue was performed essentially as previously
described.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
Vaccines prevent infectious disease largely by inducing protective neutralizing antibodies against vulnerable epitopes. 
Several major pathogens have resisted traditional vaccine development, although vulnerable epitopes targeted by neu- 
tralizing antibodies have been identified for several such cases. Hence, new vaccine design methods to induce epitope- 
specific neutralizing antibodies are needed. Here we show, with a neutralization epitope from respiratory syncytial virus, 
that computational protein design can generate small, thermally and conformationally stable protein scaffolds that accu- 
rately mimic the viral epitope structure and induce potent neutralizing antibodies. These scaffolds represent promising 
leads for the research and development of a human respiratory syncytial virus vaccine needed to protect infants, young 
children and the elderly. More generally, the results provide proof of principle for epitope-focused and scaffold-based 
vaccine design, and encourage the evaluation and further development of these strategies for a variety of other vaccine 
targets, including antigenically highly variable pathogens such as human immunodeficiency virus and influenza. 


Vaccination is a proven, safe and cost-effective way to protect against
infectious disease, but potentially vaccine-preventable illnesses continue
to place a heavy burden on the human population. Data from
recent epidemiological studies indicate that in 2010, infectious diseases
caused 18.5% of all human deaths and 23% of disability-adjusted
life years. This burden could be reduced by broader deployment and
use of existing vaccines or by other prevention modalities or treatment
regimens. However, for maximal, affordable and sustainable gains in
global health, new or improved vaccines are needed for several major
pathogens including human immunodeficiency virus (HIV)-1 (ref. 5),
malaria, Mycobacterium tuberculosis, influenza virus, dengue virus
and respiratory syncytial virus (RSV). One likely impediment to vaccine
development in these cases is the limited set of antigen design or
presentation methods available to vaccine engineers. For example, current
licensed vaccines in the United States derive from strategies that
have been available for many years: viral vaccines are composed of recombinant
virus-like particles or live, live-attenuated or whole inactivated
viruses or subunit vaccines, and bacterial vaccines are composed of
bacterial surface proteins, detoxified toxins or polysaccharides with or
without conjugation to a carrier protein.

Epitope-focused vaccine design is a conceptually appealing but
unproven method in which immunogens are designed to elicit protective
antibody responses against structural epitopes that are defined
by protective antibodies isolated from infected patients or animal models.
This strategy, if validated, could offer a potential route to vaccines for
many pathogens that have resisted traditional vaccine development,
including highly antigenically variable viruses such as HIV, influenza
and hepatitis C virus, for which broadly neutralizing antibodies have
been discovered and characterized structurally with their target epitopes.


We tested the feasibility of this strategy using an epitope from RSV, a
virus that causes lower respiratory tract infections in children and the
elderly. In 2010 RSV was estimated to be responsible for 6.7% of all
deaths in children of ages 1 month to 1 year. We focused on the epitope
targeted by the licensed, prophylactic neutralizing antibody palivizumab
(also known as Synagis, pali) and an affinity-matured variant,
motavizumab (mota). A crystal structure of mota in complex with its
epitope from the RSV Fusion (F) glycoprotein revealed that the
antibody-bound epitope attains a helix-turn-helix conformation.

We previously developed ‘side-chain grafting’ and ‘backbone grafting’
methods to transplant continuous or discontinuous epitopes to
scaffold proteins of known structure, for epitope conformational stabilization
and immune presentation. Epitope scaffold immunogens
designed by these methods for epitopes from HIV or RSV (including
the mota epitope) have in some cases induced structure-specific antibodies
but have failed to induce neutralizing antibodies. Because
these methods are restricted to scaffold proteins of predetermined
structure, we have developed a new computational method to design
scaffold proteins with full backbone flexibility, to allow greater precision
in tailoring scaffold structures for particular epitope structures.
We used this method to design scaffolds for the mota epitope, and we
found that the scaffolds had favourable biophysical and structural properties
and that scaffold immunization of rhesus macaques induced
RSV-neutralizing activity (Fig. 1).


Computational method 

Great strides have been made in developing de novo methods to design
arbitrary, idealized protein structures, but the resulting proteins
have lacked functional activity. We devised a computational method
to allow de novo folding and design of scaffold proteins stabilizing functional
motifs (Extended Data Fig. 1). This procedure, called Fold From
Loops (FFL), has four stages: (1) selection of the functional motif and
target topology to be folded around the motif; (2) ab initio folding to
build diverse backbone conformations consistent with the target topology;
(3) iterative sequence design and structural relaxation to select
low-energy amino-acid sequences for the given backbone conformations;
(4) filtering and human-guided optimization, in which the best
designs are identified by structural metrics and then subjected to optional
human-guided sequence design to correct remaining flaws.

Figure 1 | A new computational method to design epitope-focused vaccines,
illustrated with a neutralization epitope from RSV. Stages of computational


Design of epitope scaffolds 

To design scaffolds for the helix-turn-helix conformation of the mota 
epitope (PDB accession 3IXT, chain P), we selected a three-helix bundle 
(PDB 3LHP, chain S) as the template topology. Knowing that the template
protein folds into thermally stable, soluble monomers, we
designed scaffolds of similar length and position-dependent secondary 
structure. We produced 40,000 designs using FFL stages 1-3 and then 
used multiple structural filters to select eight designs for human-guided 
optimization. Additional modifications were made to those designs as 
follows: first, to optimize solubility, nearly all surface residues outside 
the epitope were replaced with those from the template protein; and 
second, to optimize side-chain packing in the buried protein core, com- 
putational design was used to design larger hydrophobic residues at 
selected buried positions of most designs (Extended Data Fig. 2). The 
final eight FFL designs had similar but non-identical backbone conformations
(pairwise root mean squared deviation (r.m.s.d.) ranging
from 0.5 to 3.0 Å) with correspondingly diverse core packing solutions
differing from each other by 8 to 42 mutations and from the template 
by 56 mutations on average (Extended Data Fig. 2). All eight FFL 
designs had identical surface residues (including non-epitope residues 
taken from the template, as well as the epitope itself). To create fully 
artificial scaffolds with different antigenic surfaces that could be used 
in heterologous prime-boost regimens with FFL scaffolds or to map 
immune responses to FFL scaffolds, we resurfaced the FFL_001 design;
this produced the ‘FFL_surf’ designs (Extended Data Fig. 2) that differed
from FFL_001 by 36 mutations on average and had no significant
sequence similarity (BLAST E value < 10⁻³) to any known protein
except the RSV F protein. 
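
Mutation counts such as those quoted above (designs differing from each other by 8 to 42 mutations) amount to counting mismatched positions between aligned sequences of equal length. The following is a minimal sketch, assuming the sequences are already aligned without gaps; the function name and the two short sequences are made-up placeholders, not scaffold sequences from the study.

```python
def count_mutations(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical 10-residue fragments, for illustration only.
design_1 = "MKLEELLKKA"
design_2 = "MKVEELIKKA"
print(count_mutations(design_1, design_2))
```

For sequences of different lengths, a proper alignment step would be needed before counting; that step is outside the scope of this sketch.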


Biophysical and structural characterization 


Six out of eight FFL designs and three out of four FFL_surf designs
could be expressed in Escherichia coli and purified, with yields ranging
from 3 to 5 mg l⁻¹. These nine scaffolds were monomeric in solution,
showed circular dichroism spectra typical for properly folded helical
proteins, and all but one were highly thermally stable with melting temperatures
(Tm) greater than 75 °C (Fig. 2a, b, Table 1 and Extended Data
Fig. 3). ¹⁵N heteronuclear single quantum coherence (HSQC) spectra
were collected for four FFL designs, and these data showed reasonable
to good peak dispersion, typical of well-behaved, globular proteins with
high α-helical content in solution (Fig. 2c, Table 1 and Extended Data
Fig. 3).
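
Melting temperatures such as these are commonly obtained by fitting thermal denaturation data to a two-state (folded/unfolded) model. The following is a minimal sketch of that model, assuming a temperature-independent unfolding enthalpy (no heat-capacity term). The enthalpy value is a hypothetical placeholder, and the 76 °C melting temperature is the FFL_001 entry in Table 1.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def fraction_folded(temp_c: float, tm_c: float, delta_h: float) -> float:
    """Two-state model: fraction folded at temp_c, given Tm (deg C) and
    the unfolding enthalpy delta_h (J/mol, assumed temperature-independent)."""
    t = temp_c + 273.15
    tm = tm_c + 273.15
    # Unfolding equilibrium constant from the van't Hoff relation.
    k_unfold = math.exp(-(delta_h / R) * (1.0 / t - 1.0 / tm))
    return 1.0 / (1.0 + k_unfold)


TM_C = 76.0        # FFL_001 melting temperature from Table 1
DELTA_H = 300_000  # hypothetical unfolding enthalpy, J/mol

for temp in (25.0, 76.0, 95.0):
    print(temp, round(fraction_folded(temp, TM_C, DELTA_H), 3))
```

By construction the folded fraction is exactly one half at the melting temperature; a real fit would also estimate the enthalpy and the folded and unfolded baselines from the data.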

The nine purifiable FFL and FFL_surf scaffolds all had high affinity
for mota, as assessed by surface plasmon resonance (SPR) (KD < 800 pM,
Fig. 2d, Table 1 and Extended Data Fig. 4). In particular, six out of nine
scaffolds had very high mota affinities (KD = 6-94 pM) and slow dissociation
rates (koff ~ 10⁻⁴ s⁻¹) comparable to those of the mota interaction
with the RSV F glycoprotein (KD = 35 pM and koff = 0.31 × 10⁻⁴ s⁻¹).
The mota-scaffold interaction was also specific: the
point mutation K82E on scaffold FFL_001, analogous to the K272E
mota escape mutation on RSV F, reduced mota binding by more
than a factor of 1,000 (Extended Data Fig. 4). These results suggested
that the conformation of the native epitope was reproduced accurately
on the scaffolds. The mota affinities for FFL scaffolds were three to
four orders of magnitude higher than the mota affinities for the free
epitope peptide (KD = 210-240 nM) or for the best side-chain-grafting
epitope scaffold previously reported (KD = 90-125 nM).
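
The affinities reported here follow from the kinetic rate constants: for a simple 1:1 binding model, the equilibrium dissociation constant is KD = koff/kon. The following is a minimal sketch using the FFL_001 values listed in Table 1. The kon value of 3.99 × 10⁶ M⁻¹ s⁻¹ is taken as an assumption that is consistent with the tabulated koff and KD, and the function name is illustrative.

```python
def dissociation_constant_pM(k_on: float, k_off: float) -> float:
    """KD = k_off / k_on for a 1:1 binding model, converted from M to pM."""
    return k_off / k_on * 1e12


# FFL_001 binding to motavizumab (Table 1); k_on exponent assumed to be 10^6.
k_on = 3.99e6    # association rate, M^-1 s^-1
k_off = 1.19e-4  # dissociation rate, s^-1
print(round(dissociation_constant_pM(k_on, k_off), 1))
```

The result is about 30 pM, in line with the 29.9 pM listed for FFL_001; the small difference comes from rounding of the tabulated rate constants.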

To evaluate the degree to which high-resolution structures of the
FFL designs recapitulated the design models and the mota epitope, we
solved two crystal structures: unliganded FFL_005 and the complex of
FFL_001 bound to mota Fab (Fig. 2e, f), to resolutions of 2.0 and 2.7 Å,
respectively (Extended Data Fig. 5). The crystal structures showed good
overall agreement with the design models: the backbone r.m.s.d. over
all residues was 1.7 Å for FFL_005 and 1.2 Å for FFL_001 (Fig. 2e, f),
and the all-atom r.m.s.d. for the core side chains was 2.5 Å for FFL_005
and 1.8 Å for FFL_001. Consistent with the biophysical data, both
unliganded and mota-bound structures revealed a high degree of epitope
mimicry. Compared to the structure of peptide in complex with
mota (PDB 3IXT), the epitope backbone r.m.s.d. was 0.5 Å for FFL_005
(Fig. 2g) and 0.4 Å for FFL_001. Compared to structures of pre- and
post-fusion RSV F trimer (PDB 4JHW and 3RRR), which were not
available at the time the designs were carried out, epitope backbone
r.m.s.d. was 0.3 and 0.4 Å for FFL_005, respectively. Furthermore, the
interaction of FFL_001 with mota accurately recapitulated the interaction
of mota with peptide; superposition of the epitope and paratope
of both complexes gave an all-atom r.m.s.d. of 0.8 Å (Fig. 2h).
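
The r.m.s.d. values quoted above summarize how far matched atoms lie from one another once two structures have been superimposed. The following is a minimal sketch of the calculation, assuming the coordinates are already superimposed and the atoms are paired in the same order. The coordinates are invented for illustration and do not come from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between paired 3D points
    (same atom order, same reference frame)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical backbone atom positions, in angstroms.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
crystal = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]
print(round(rmsd(model, crystal), 3))
```

In practice the optimal superposition is computed first (for example with a least-squares fitting procedure), and the choice of atoms (backbone, core side chains or all atoms) determines which of the reported r.m.s.d. types is obtained.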


Immunological evaluation 


To assess whether humans can make antibodies specific for the RSV
epitope structure stabilized on the scaffolds, we tested the binding of
sera from six RSV-seropositive humans to RSV F, FFL_001 and FFL_001
variants with two different epitope mutations (N72Y and K82E) corresponding
to RSV escape mutations for pali (N262Y and K272E) and
mota (K272E) (Fig. 2g and Extended Data Fig. 6). Although all sera
reacted with RSV F and none reacted to the scaffold escape mutants,
three sera displayed reactivity to FFL_001. These data confirmed that
the FFL scaffolds presented a clinically relevant epitope conformation
and illustrated that epitope scaffolds have promise as reagents to assay
levels of epitope-structure-specific antibodies in polyclonal sera.

Figure 2 | Biophysical and structural characterization of scaffold FFL_001.
a, Size-exclusion chromatography coupled in-line with multi-angle light
scattering measured a molecular weight in solution of ~15 kDa, corresponding
to a monomer. LS, light scattering; UV, ultraviolet absorbance. b, Circular
dichroism data fit with a two-state model showed that the protein had a melting
temperature of 74 °C. Inset, the wavelength scan at 25 °C exhibited two minima
characteristic of an all-helical protein. Deg, degrees; MRE, mean residue
ellipticity. c, Two-dimensional ¹H-¹⁵N HSQC spectrum at 25 °C and 600 MHz
showed good peak dispersion typical of well-folded, α-helical proteins. d, SPR
data and model fits (red lines) of the interaction with mota Fab analyte, from
which the dissociation constant (KD) was measured to be 29.9 pM. Similar
results were obtained with scaffold analyte and mota IgG ligand. RU, response
units. e, Crystal structure of unliganded FFL_005 (blue, green and salmon
helices, with yellow epitope), superimposed with the design model (grey with
yellow epitope). f, Crystal structure of FFL_001 bound to mota Fab,
superimposed with the design model. Colouring as in e, but with the Fab light
and heavy chains in grey and purple, respectively. g, Superposition of the
epitope structure from unliganded FFL_005 (yellow) and the complex of
peptide (green) bound to mota (PDB 3IXT). FFL_005 is coloured salmon
outside the epitope. The positions of escape mutations for pali (262 and 272,
RSV numbering) or mota (272) are noted. h, Superposition of the
mota-liganded structures of FFL_001 and peptide (PDB 3IXT). The antibody
chains of 3IXT are coloured in wheat, and the interfacial side chains of both
epitope and antibody are shown in stick representation.

We tested whether the FFL scaffolds could induce RSV-neutralizing 
antibodies by vaccination in both BALB/c mice and rhesus macaques. 
Four immunogens were tested: monomeric scaffolds FFL_001, FFL_005 
and FFL_007, and a virus-like particle consisting of hepatitis B core 
antigen (HBcAg) particles conjugated with multiple copies of FFL_001 
(refs 25, 26). Mice produced robust binding antibody responses against 
the autologous antigens, but binding antibody responses against RSV F 
protein or RSV viral lysate were detected in only a few animals (Extended 
Data Fig. 6), and neutralizing activity as judged by a plaque reduction 
assay was not detected (not shown). In contrast to the mouse results, after 
three immunizations all macaques produced robust binding responses 
not only against the autologous antigens (Fig. 3a) but also recombinant 
RSV F protein (Extended Data Fig. 6), and most animals responded to 
RSV viral lysate (Fig. 3a and Supplementary Table 1). Neutralizing activ- 
ity was detected by the plaque assay in 7 out of 16 macaques after three 
immunizations and in 12 of 16 macaques after five immunizations 
(Figs 1 and 3a and Supplementary Table 2). Neutralizing activities were 
confirmed at selected time points using two different assays (micro- 
neutralization and a flow cytometry-based assay) in different labora- 
tories, and included measurement of neutralizing activity against RSV 
subtype B as well as subtype A (Extended Data Figs 6, 7 and Supplementary
Table 2). To benchmark the neutralization potency, selected macaque
sera were tested side by side with sera from seropositive human adults, 
in both the plaque reduction and flow cytometry assays (Fig. 3b, c). 
The results in both assays demonstrate that the best-responding maca- 
ques, including two out of four animals in the particle group at week 20 
and one animal in that group at week 12, have neutralization titres 
comparable to those induced by natural human infection. This is note- 
worthy given that natural infection exposes multiple epitopes on the RSV 
F and G glycoproteins, whereas the scaffolds exposed only one epitope. 


Monoclonal antibody characterization 


To study the molecular basis for the vaccine-induced neutralizing activity,
we used single-B-cell sorting to isolate epitope-specific monoclonal
antibodies from memory B cells of one animal from the particle group
with potent serum neutralizing activity. We isolated B cells that bound
strongly to FFL_001 but not to a double mutant of FFL_001
(FFL_001_N72Y_K82E) containing both pali escape mutations. Following DNA
sequencing of antibody variable genes in those cells, we produced 11
recombinant monoclonal antibodies, of which eight bound with high
avidity to FFL_001 and two (17-HD9 and 31-HG7) bound with high
avidity to RSV F protein (Fig. 4a and Extended Data Fig. 8). SPR revealed
that these two monoclonal antibodies, which are clonal relatives, have
extremely high affinities (KD ~ 3 pM) for the scaffold FFL_001 that
elicited them when mounted on the particle (Extended Data Fig. 8).
Concomitant with high affinities, these two monoclonal antibodies have
neutralization potencies similar to mota and higher than pali by nearly
an order of magnitude (Fig. 4b and Extended Data Fig. 8).

To map the epitopes for 17-HD9 and 31-HG7, we assessed binding
to several scaffold variants (Extended Data Fig. 9). Both monoclonal
antibodies: (1) bound with very high affinity (KD = 40-50 pM) to
FFL_001_surf1, which has an antigenically distinct surface from FFL_001
outside the RSV epitope; (2) retained high affinity (KD = 180-330 pM)
for the FFL_001_K82E mota escape mutant; (3) retained modest affinity
(KD = 60-140 nM) for the FFL_001_N72Y_K82E double escape mutant;
and (4) lacked detectable affinity for FFL_MPV_001, which swaps RSV
residues on FFL_001 to those at the analogous positions on human
metapneumovirus, which has a similar helix-turn-helix conformation
(r.m.s.d. 0.9 Å, PDB 4DAG) but very different amino acid sequence.
These results indicate that the two macaque monoclonal antibodies
target the same helix-turn-helix epitope as mota and pali but have different
fine specificities.

To understand the structural basis for the binding and neutralizing
potency of these macaque monoclonal antibodies, we pursued crystallography
of 17-HD9 and 31-HG7 complexes with FFL_001. We obtained
crystals of the 31-HG7-FFL_001 complex that diffracted to 3.8 Å, which
was sufficient to determine a molecular replacement solution using the
FFL_001 crystal structure and a composite Fab model, but insufficient
to perform detailed rebuilding and refinement. The molecular replacement
solution allowed determination of the rigid-body orientation of
31-HG7 relative to FFL_001 and demonstrated that 31-HG7 approaches
the helix-turn-helix from a different angle than mota (angle difference
~56°, Fig. 4c). We also obtained crystals and determined the structure
of the 17-HD9-FFL_001 complex (resolution = 2.5 Å), which contained
four complexes of 17-HD9 bound to a 35-residue helix-turn-helix peptide
(scaffold substructure) in the asymmetric unit (Fig. 4c and Extended
Data Fig. 10). The 17-HD9 complex structures demonstrated that 17-HD9
recognizes essentially the same helix-turn-helix epitope as mota
and pali: the conformation of the epitope in the 17-HD9 complexes
is very similar to that in the structures of mota-FFL_001 (r.m.s.d.
0.5-0.7 Å), RSV F pre-fusion (r.m.s.d. 0.3-0.4 Å) and RSV F post-fusion
(r.m.s.d. 0.5 Å), and 85% of the epitope residues buried by either mota
or 17-HD9 are also buried by the other (Fig. 4d and Supplementary
Table 3). Although 17-HD9 and mota bury a similar amount of area
on the epitope (690 Å² versus 683 Å²), 17-HD9 uses a different paratope
to make more hydrogen bonds (15-18 versus 7) that plausibly contribute
to its higher scaffold affinity and higher neutralization potency (Fig. 4e
and Supplementary Tables 4, 5). The 17-HD9 complexes are also consistent
with the ability of 17-HD9 to bind to the K82E mota escape mutant:
density for the K82 side chain is absent in two out of four 17-HD9 complexes,
and K82 is only 37% buried by 17-HD9 in the other two complexes
(Fig. 4e); by contrast, K82 is 65% buried by mota and makes a
buried salt bridge to mota light-chain residue D50. Taken together, these
results demonstrate that epitope scaffold immunization can ‘re-elicit’
neutralizing antibodies that target with high precision an epitope
predefined by a protective antibody.

Table 1 | Biophysical properties of scaffolds and scaffold-antibody interactions

                                                   SPR motavizumab
Molecule       Multimeric  Tm     ΔG            kon          koff         koff/kon  NMR-HSQC
               state       (°C)   (kcal mol⁻¹)  (M⁻¹ s⁻¹)    (s⁻¹)        (pM)      dispersion
FFL_001        Mon         76     ND            3.99 × 10⁶   1.19 × 10⁻⁴  29.9      Dispersed
FFL_002        Mon         49     ND            1.56 × 10⁶   7.34 × 10⁻⁴  469.9     ND
FFL_004        Mon         >85    ND            1.05 × 10⁶   8.32 × 10⁻⁴  795.0     ND
FFL_005        Mon         >100   15.0          2.97 × 10⁶   2.09 × 10⁻⁴  70.3      Partially dispersed
FFL_006        Mon         >85    ND            3.57 × 10⁵   2.32 × 10⁻⁴  651.9     Dispersed
FFL_007        Mon         >85    ND            1.45 × 10⁶   1.36 × 10⁻⁴  94.1      Partially dispersed
FFL_001_surf1  Mon         84     8.2           7.43 × 10⁷   4.70 × 10⁻⁴  6.32      ND
FFL_001_surf2  Mon         >85    8.1           5.32 × 10⁶   1.58 × 10⁻⁴  29.6      ND
FFL_001_surf4  Mon         >85    9.0           4.80 × 10⁶   1.58 × 10⁻⁴  32.9      ND

Mon, monomer; ND, not done.


Discussion 


We have demonstrated that small, thermally and conformationally stable 
protein scaffolds that accurately mimic the structure of a viral neutral- 
ization epitope can induce neutralizing activity in a majority of vacci- 
nated macaques. The results establish the feasibility of epitope-focused 
and scaffold-based vaccine design, and encourage the application of 
these strategies for a variety of vaccine targets. The biophysical, struc- 
tural and functional data on the mota scaffolds validate the computa- 
tional design method (FFL), and support its continued development 
and application to other vaccine epitopes and other types of functional 
sites. Indeed, the data should encourage the general use of methods 
using protein backbone flexibility to design novel functional proteins. 

The scaffolds themselves represent promising leads for RSV vac- 
cine research and development (particularly the scaffolds presented 
on virus-like particles). Non-replicating RSV vaccine candidates are 
not tested in RSV-naive young infants, the highest priority target population,
owing to vaccine-mediated disease enhancement in early clinical 
trials of formalin-inactivated RSV". Scaffold immunogens that focus 
antibody responses to a known protective epitope but are otherwise 
Figure 3 | Serological analysis of immunized macaques. a, ELISA end point 
titres measured against the autologous immunogen (left) or against RSV 
whole viral lysate (middle), and 50% neutralization titres as determined by the 
plaque reduction assay (right). The immunization groups are shown on the far 
left, and the schedule is indicated at the bottom. Small symbols connected 
with dashed lines indicate individual animals. Large symbols connected with 
solid lines report group averages, with error bars showing standard deviations, 
measured over the four animals in each group at each time point. 

b, Comparison of 50% neutralization titres for sera from six RSV-seropositive 
humans and sera from eight macaques from weeks 12 and 20, measured side 
by side in the plaque reduction assay. Mean ± standard deviation for the 
human data is 218 ± 145. Two macaque data points at both week 12 and 
week 20 are not visible in the graph because no neutralizing activity was 
detected. c, Comparison of 50% neutralization titres for sera from 20 
RSV-seropositive humans and sera from five macaques from week 20, 
measured side by side in the flow cytometry assay. Mean ± standard deviation 
for the human data is 462 ± 792. 
Figure 4 | Analysis of monoclonal antibodies isolated from an immunized 
macaque. a, Enzyme-linked immunosorbent assay (ELISA) binding of the 
macaque monoclonal antibodies (17-HD9 and 31-HG7) and pali to RSV F. 
b, Neutralization of RSV by the macaque monoclonal antibodies and pali, 
measured by a microneutralization assay. The half-maximum inhibitory 
concentrations (IC50) for pali, 17-HD9 and 31-HG7 were 0.08, 0.005 and 
0.007 µg ml−1, respectively. c, Molecular replacement model of 31-HG7 bound 
to FFL_001 (left), a crystal structure of 17-HD9 bound to a 35-residue 
helix-turn-helix peptide from FFL_001 (middle) and the crystal structure of 
mota (PDB: 3IXT) bound to peptide. The three structures are aligned with 
respect to the helix-turn-helix epitope. d, Structural alignment of the 
helix-turn-helix epitopes bound to mota (blue) and 17-HD9 (white), in which 
side chains are coloured orange if at least 15% of the total area (backbone 
plus side chain) of that residue is buried by the respective antibody. Nine 
positions are buried by both antibodies, two positions in the turn are buried 
only by 17-HD9 (P265 and T267, RSV numbering), and two positions near the 
peptide termini are buried only by mota (S255 and N276). e, Close-up view of 
the interface between 17-HD9 and helix-turn-helix epitope. Interaction 
residues are shown in stick, and the complementarity-determining region H3 
(CDRH3) is coloured red. K82/K272 (scaffold numbering/RSV numbering), 
at the edge of the interface, is coloured grey. 


unrelated to RSV may have a lower safety barrier to testing in human 
infants than other non-replicating RSV vaccine approaches. The fre- 
quency, magnitude and durability of neutralizing responses to these 
scaffolds remain to be optimized, by varying such parameters as adju- 
vant, dose, schedule, particle display system and mode of delivery. In 
the context of vaccinating a diverse range of humans (or nonhuman 
primates) with different, time-dependent B-cell repertoires, neutral- 
izing antibody responses to a single epitope may be more variable than 
responses to whole antigens or pathogens containing multiple neut- 
ralization epitopes. (Indeed, our discordant results in mice compared 
to macaques may reflect differences in species-dependent repertoires.) 
Thus, epitope-focused or scaffold-based vaccines for RSV or other path- 
ogens may also be improved by inclusion of more than one epitope. 
Features of the very potent neutralizing monoclonal antibodies iso- 
lated from one vaccinated macaque offer implications for vaccine design. 
These monoclonal antibodies had unusually high affinity for the elic- 
iting antigen, for example when compared to monoclonal antibodies 
isolated from macaques that had been immunized with different, more 
conformationally labile antigens using similar regimens”. This suggests 
that rigid epitope structures may more efficiently induce extremely high- 
affinity antibodies, a possibility that merits further investigation. In cases 
of antigenically highly variable pathogens such as HIV, influenza or 
hepatitis C virus, the vaccine challenge is to induce responses to con- 
served but immunorecessive epitopes instead of the strain-specific 
epitopes that dominate the response to native antigens. Such conserved 
epitopes—the sites of vulnerability targeted by broadly neutralizing 
antibodies—are typically in close physical proximity to variable resi- 
dues, making precision of immuno-focusing a vaccine requirement. 
Our crystallographic finding that scaffold-elicited monoclonal anti- 
bodies recapitulate the mota neutralization specificity with high pre- 
cision provides proof of principle that epitope-focused vaccine design 
can meet this immuno-focusing challenge. 


METHODS SUMMARY 


Details of the FFL computational design protocol are provided in Methods. Pro- 
tocols for protein expression and purification, biophysical characterization, virus- 
like particle preparation, X-ray crystallography, NMR, animal immunization, 
enzyme-linked immunosorbent assays, neutralization assays and monoclonal iso- 
lation are provided in Methods and Supplementary Information. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
Reflection from the strong gravity regime in a lensed 
quasar at redshift z = 0.658 


R. C. Reis1, M. T. Reynolds1, J. M. Miller1 & D. J. Walton2 


The co-evolution of a supermassive black hole with its host galaxy’ 
through cosmic time is encoded in its spin?*. At z > 2, supermas- 
sive black holes are thought to grow mostly by merger-driven accre- 
tion leading to high spin. It is not known, however, whether below 
z= 1 these black holes continue to grow by coherent accretion or in 
a chaotic manner’, though clear differences are predicted** in their 
spin evolution. An established method® of measuring the spin of 
black holes is through the study of relativistic reflection features’ 
from the inner accretion disk. Owing to their greater distances from 
Earth, there has hitherto been no significant detection of relativistic 
reflection features in a moderate-redshift quasar. Here we report an 
analysis of archival X-ray data together with a deep observation of a 
gravitationally lensed quasar at z = 0.658. The emission originates 
within three or fewer gravitational radii from the black hole, implying
a spin parameter (a measure of how fast the black hole is rotating)
of $a = 0.87^{+0.08}_{-0.15}$ at the 3σ confidence level and a > 0.66 at the 5σ 
level. The high spin found here is indicative of growth by coherent 
accretion for this black hole, and suggests that black-hole growth at 
0.5 ≲ z ≲ 1 occurs principally by coherent rather than chaotic accretion
episodes. 

When optically thick material, for example, an accretion disk, is 
irradiated by hard X-rays, some of the flux is reprocessed into an addi- 
tional ‘reflected’ emission component, which contains both continuum 
emission and atomic features. The most prominent signature of reflection 
from the inner accretion disk is typically the relativistic iron Kα line 
(6.4-6.97 keV; rest frame)* and the Compton reflection hump’, often 
peaking at 20-30 keV (rest frame). However, the deep gravitational 
potential and strong Doppler shifts associated with regions around 
black holes will also cause the forest of soft X-ray emission lines in 
the approximately 0.7-2.0-keV range to be blended into a smooth emis- 
sion feature, providing a natural explanation for the ‘soft excess’ observed 
in the X-ray spectra of nearby active galactic nuclei (AGNs)*”°. Indeed, 
both the iron (Fe) line and the soft excess can be used to provide insight 
into the nature of the central black hole and to measure its spin’. 
Previous studies have revealed the presence of a soft excess in over 
90% of quasars at'’'? around z < 1.7, and broad Fe-lines are also seen 
in ≳25% of these objects'"”’, suggesting that reflection is also prevalent
in these distant AGNs. However, owing to the inadequate signal- 
to-noise ratio resulting from their greater distances, the X-ray spectra 
of these quasars were necessarily modelled using simple phenomeno- 
logical parameterizations'*™. 
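
The rest-frame energies quoted above can be translated into the observed frame with the standard redshift relation E_obs = E_rest/(1 + z). The short sketch below does this for a source at z = 0.658; the function name is illustrative, and the labelling of 6.4 keV and 6.97 keV as neutral and hydrogen-like iron is standard atomic-physics usage rather than something stated in the text.

```python
def observed_energy_kev(rest_energy_kev: float, z: float) -> float:
    """Observed-frame photon energy (keV) for a source at redshift z."""
    if z <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return rest_energy_kev / (1.0 + z)


Z_QUASAR = 0.658  # redshift of the lensed quasar quoted in the text

features = [
    ("Fe K-alpha, neutral iron", 6.4),
    ("Fe K-alpha, H-like iron", 6.97),
    ("Compton hump, lower edge", 20.0),
    ("Compton hump, upper edge", 30.0),
]

for label, e_rest in features:
    e_obs = observed_energy_kev(e_rest, Z_QUASAR)
    print(f"{label}: {e_rest:.2f} keV (rest) -> {e_obs:.2f} keV (observed)")
```

Under this relation the iron line falls near 3.9-4.2 keV and the reflection hump near 12-18 keV in the observed frame, which is why only the low-energy rise of the hump is accessible to instruments that stop at 8-10 keV.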

Gravitational lensing offers a rare opportunity to study the inner- 
most relativistic region in distant quasars’*"°; the ‘lens’ galaxy acts as a 
natural telescope, magnifying the light from the faraway quasar. Quasars 
located at 0.5 ≲ z ≲ 1 are considerably more powerful than local Seyfert 
galaxies (relatively nearby active galaxies) and are known to be a major 
contributor to the cosmic X-ray background”, making them objects 
of particular cosmological importance. 1RXS J113151.6−123158 (hereafter
RX J1131−1231) is a quadruply imaged quasar at redshift z = 0.658 
hosting a supermassive black hole ($M_{\rm BH} \approx 2 \times 10^{8}\,M_\odot$, where $M_\odot$ is 
the mass of the Sun)’® gravitationally lensed by an elliptical galaxy at 
z= 0.295 (ref. 19). X-ray and optical observations exhibited an intriguing 


flux-ratio variability between the lensed images, which was subsequently 
revealed to be due to significant gravitational micro-lensing by stars in 
the lensing galaxy’**"*. 

Taking advantage of gravitational micro-lensing techniques, aug- 
mented by substantial monitoring with the Chandra X-ray observatory, 
a tight limit of the order of about ten gravitational radii ($r_g = GM/c^2$, 
where G is the gravitational constant, M is the mass of the black hole 
and c is the speed of light in vacuum) was placed” on the maximum size 
of the X-ray emitting region in RX J1131−1231, indicative of a highly 
compact”! source of emission (less than about three billion kilometres, or 
20 astronomical units, AU). The lensed nature of this quasar provides an 
excellent opportunity to study the innermost regions around a black 
hole at a cosmological distance (the look-back time for RX J1131—1231 
is about 6.1 billion years), and with this aim Chandra and XMM-Newton 
have accumulated nearly 500,000 s of data on RXJ1131—1231. 
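
As a rough consistency check on the size scale quoted above, the sketch below evaluates r_g = GM/c² for a black hole of about 2 × 10⁸ solar masses and converts ten gravitational radii into kilometres and astronomical units. The physical constants are rounded textbook values, and the function name is illustrative only.

```python
G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light in vacuum, m s^-1
M_SUN = 1.989e30  # solar mass, kg
AU = 1.496e11     # astronomical unit, m


def gravitational_radius_m(mass_solar: float) -> float:
    """Gravitational radius r_g = GM/c^2 in metres for a mass in solar units."""
    return G * mass_solar * M_SUN / C**2


r_g = gravitational_radius_m(2e8)  # black-hole mass estimate quoted in the text
size_limit = 10.0 * r_g            # "about ten gravitational radii"

print(f"r_g        = {r_g / 1e3:.3e} km")
print(f"10 r_g     = {size_limit / 1e3:.3e} km")
print(f"10 r_g     = {size_limit / AU:.1f} AU")
```

With these inputs ten gravitational radii come out at close to three billion kilometres, or roughly 20 AU, in line with the figures given in the text.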

Starting with the Chandra data, fits with a model consisting of a simple 
absorbed power-law continuum, both to the data from individual lensed 
images (Extended Data Fig. 1) and to the co-added data (Extended 
Data Figs 2, 3 and 4), reveal broad residual emission features at low 
energies (<2-keV rest frame, the soft excess) and also around the Fe K 
energies (3.5-7-keV rest frame), which are characteristic signatures of 
relativistic disk reflection”. To treat these residuals, we consider two 
template models based on those commonly used to fit the spectra of 
Seyfert galaxies*”* and stellar-mass black-hole binaries”, and which 
have also at times been used to model local quasars”. The first is a simple 
phenomenological combination of a power-law, a soft-thermal disk 
and a relativistic Fe-line component (‘baseline-simple’), and the second 
Figure 1 | Chandra image of RX J1131−1231. This representative image of a 
single epoch was made using subpixel techniques in the 0.3-8-keV energy range 
(Supplementary Information) and is shown here smoothed with a Gaussian 
(σ = 0.25″). The green circles show the source extraction regions. For images 
A-C, we used a radius of 0.492″, whereas the source region for image D was set 
to 0.984″. Individual source and background regions were made for all 30 
observations and spectra were extracted from the unsmoothed images. The 
colour (logarithmic) scale reflects the number of counts in each particular pixel 
ranging from 0 to a maximum of 212 counts. 
employs a self-consistent blurred-reflection model together with a power 
law (‘baseline-reflection’). In addition, both models include two neutral 
absorbers; the first to account for possible intrinsic absorption at the 
redshift of the quasar; and the second to account for Galactic absorption. 
We first statistically confirm the presence of reflection features in 
RXJ1131-1231 using the baseline-simple model. Least-squares fits were 
made to all the individual Chandra spectra of image B (the lens galaxy 
creates four images, which we label A-D; see Fig. 1) simultaneously, 
allowing only the normalizations of the various components and the 
power-law indices to vary (Extended Data Fig. 1). The thermal component,
used here as a proxy for the soft excess, is required at >10σ and 
a Fisher’s F-test indicates that the addition of a relativistic emission line 
to the combined Chandra data of image B is significant at greater than 
the 99.9% confidence level. Tighter constraints (>5σ) on the significance
of the relativistic Fe line can be obtained by co-adding all Chandra 
data to form a single, time-averaged spectrum representative of the 
average behaviour of the system (Fig. 2 and Extended Data Fig. 3). 
The XMM-Newton observation also shows the clear presence of a 
soft excess below about 1.2 keV, again significant at >5σ, and thanks to 
its high effective area above approximately 5 keV, it also displays the 
presence of a hardening (more high-energy photons) at high energies 
(Extended Data Fig. 6). An F-test indicates that a break in the power 
law at around 5 keV (Fig. 2) is significant at the 3.6σ level of confidence. 
This hardening is consistent with the expectation of a reflection spec- 
trum and can be characterized with the Compton reflection hump’. 
The unprecedented data quality for this moderate-z quasar (about 
100,000 counts in the 0.3-8 keV energy range from each of the Chandra 
and XMM-Newton data sets) enables us to apply physically motivated, 
self-consistent models for the reflection features. We proceed by using 
Figure 2 | Broad Fe line, soft and hard excess in RX J1131−1231. The figure 
shows the co-added Chandra data over all epochs for all four lensed images. 
The blue line is a guide to the eye of a perfect model fit to the data. The data was 
fitted in a phenomenological manner with a model consisting of an absorbed 
power law with an index of Γ = 1.60 ± 0.04 for the continuum, a thermal 
disk component with a temperature of 0.19 ± 0.02 keV to account for the soft 
excess, and a broad relativistic line with energy constrained to lie between 
6.4 keV and 6.97 keV (rest frame; the baseline-simple model). The ratio 

(of the data to the model) is shown after setting the normalization of the disk, 
the relativistic line and the narrow line component to zero, in order to highlight 
these features. The inset shows the XMM-Newton data fitted with a Γ = 1.83 
power law. The best-fitting, phenomenological model for the XMM-Newton 
data requires the presence of a soft excess, which can again be characterized 
by a thermal disk component with a temperature of 0.22 ± 0.03 keV, 

a power law with an index Γ = 1.83 up to a break at E_break = 5.5 keV, 
at which point it hardens to Γ = 1.28. This hardening is interpreted as 
the Compton reflection hump. Both co-added spectra shown here probe the 
time-averaged behaviour of RXJ1131-1231. Quoted errors refer to the 90% 
confidence limit and the error bars are 1σ. 
the baseline-reflection model to estimate the spin parameter through a 
variety of analyses, including time-resolved and time-averaged ana- 
lyses of individual Chandra images, using its superior angular resolu- 
tion, and through analysis of the average spectrum obtained from all 
four lensed images with XMM-Newton. During the time-resolved 
analysis, the black-hole spin parameter as well as the disk inclination 
and emissivity profile were kept constant from epoch to epoch, whereas 
the normalizations of the reflection and power-law components, as 
well as the ionization state of the disk and the power-law indices, were 
allowed to vary between epochs (see the Supplementary Information 
for further details). In all cases, we obtain consistent estimates for the 
black-hole spin, which imply that RXJ1131—1231 hosts a rapidly 
rotating black hole (Extended Data Fig. 5 and Extended Data Tables 1 
and 2). Finally, to optimize the signal-to-noise ratio and obtain the best 
estimate of the spin parameter we fit the combined Chandra and 
XMM-Newton data of RX J1131−1231 simultaneously with the baseline-reflection
model and find $a = 0.87^{+0.08}_{-0.15}\,Jc/GM^2$ (where J is the 
angular momentum) at the 3σ level of confidence (and $a > 0.66\,Jc/GM^2$
at the 5σ confidence level; Fig. 3). 
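
The significance levels quoted here and in the figure captions mix 'σ' and percentage conventions. A minimal sketch of the conversion, assuming the usual two-sided Gaussian convention (the text does not state which convention the authors used for lower limits), is given below; the function name is illustrative.

```python
import math


def sigma_to_confidence(n_sigma: float) -> float:
    """Two-sided Gaussian probability enclosed within +/- n_sigma."""
    if n_sigma < 0:
        raise ValueError("n_sigma must be non-negative")
    return math.erf(n_sigma / math.sqrt(2.0))


for n in (1.0, 3.0, 3.6, 5.0):
    print(f"{n:.1f} sigma -> {100.0 * sigma_to_confidence(n):.5f}% confidence")
```

Under this convention 3σ corresponds to the 99.73% level shown as a contour in Fig. 3.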

The tight constraint on the spin of the black hole in this gravita- 
tionally lensed quasar represents a robust measurement of black-hole 
spin beyond our local Universe. The compact nature of the X-ray 
corona returned by the relativistic reflection model used herein con- 
firms the prior micro-lensing analysis'*’°, and hence moves the basic 
picture of X-ray emission in quasars away from large X-ray coronae”® 
that may blanket at least the inner disk, and more towards a compact 
emitting region in the very innermost parts of the accretion flow, 
consistent with models for the base of a jet”. 

In addition to constraining the immediate environment and spin of 
the black hole, the analysis presented here has implications for the nature 
of the cosmic X-ray background. The best-fitting baseline-reflection 
model to the time-averaged Chandra and XMM-Newton spectra (Extended 
Data Figs 3 and 6) suggests that the source is at times reflection-dominated, 
that is, we find the ratio of the reflected to the illuminating continuum 
in the Chandra and XMM-Newton data to be f_reflect/f_illum = 2.3 ± 1.2 
Figure 3 | Goodness of fit versus the spin parameter of the supermassive 
black hole in RX J1131—1231. Fits were made with the spin parameter a 
varying from 0.495 to 0.995 in steps of 0.02 with all parameters of the model 
allowed to vary. The contour was generated by adopting a model consisting of a 
power law together with a relativistic blurred reflection by an accretion disk, 
as well as two neutral absorbers: one at the redshift of the quasar and another 
local to our Galaxy (the baseline-reflection model). The fit was made to the 
co-added spectra from both XMM-Newton and Chandra simultaneously, with 
the assumption that the spin of the black hole, the inclination of the accretion 
disk and the total hydrogen in our line of sight does not change between 
observations. The dotted lines show the 99.99%, 99.73% (3σ) and 90% 
confidence limits where it becomes clear that the supermassive black hole 
in RX J1131−1231 must be rotating with a spin $a = 0.87^{+0.08}_{-0.15}\,Jc/GM^2$ at the 
3σ level of confidence. 
and 0.47 ± 0.15, respectively, in the 0.1-10-keV band (local frame; 
Extended Data Table 2). However, it must be noted that uncertainties 
in the size of the microlensed regions could affect the absolute value of 
this ratio. Nonetheless, this analysis clearly demonstrates the presence 
of a significant contribution from a reflection component to the X-ray 
spectrum of this z = 0.658 quasar. The properties of RXJ1131—1231 
are consistent'''*”° with the known observational characteristics of 
quasars at 0.5 < z < 1, and our results suggest that the relativistic 
reflection component from the large population of unobscured quasars 
expected in this epoch” could significantly contribute in the 20-30-keV 
band of the cosmic X-ray background. 

Although questions have previously been raised over whether reflection 
is a unique interpretation for the features observed in AGNs, the 
amassed evidence points towards this theoretical framework”*”*, and 
reached culmination with the launch of NuSTAR (the Nuclear Spec- 
troscopic Telescope Array is an Explorer mission orbiting Earth that 
will allow us to study the Universe with high energy X-rays) in June 
2012 and the strong confirmation of relativistic disk reflection from a 
rapidly spinning supermassive black hole at the centre of the nearby 
galaxy NGC 1365 (ref. 6). Nonetheless, there still remain possible sys- 
tematic uncertainties, for example, owing to the intrinsic assumption 
that the disk truncates at the innermost stable circular orbit. Simulations 
have been performed that are specifically aimed at addressing the 
robustness of this assumption”’, which found that emission within this 
radius is negligible, especially for rapidly rotating black holes, as is the 
case here. 
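
The link between spin and the inner edge of the disk can be made concrete with the standard Kerr-metric expression for the prograde innermost stable circular orbit (Bardeen, Press & Teukolsky 1972). The sketch below evaluates that textbook formula only; it is not the blurred-reflection model used in the analysis, and the function name is illustrative.

```python
import math


def isco_radius_rg(a: float) -> float:
    """Prograde ISCO radius in units of r_g = GM/c^2 for spin 0 <= a <= 1."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("spin must lie in [0, 1] for this prograde formula")
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * a * a + z1 * z1)
    # max() guards against tiny negative values from floating-point rounding
    return 3.0 + z2 - math.sqrt(max(0.0, (3.0 - z1) * (3.0 + z1 + 2.0 * z2)))


for spin in (0.0, 0.66, 0.87, 0.95, 1.0):
    print(f"a = {spin:.2f} -> r_ISCO = {isco_radius_rg(spin):.2f} r_g")
```

For a = 0.87 the formula gives roughly 2.5 r_g, compatible with emission arising within about three gravitational radii, whereas a non-rotating black hole would place the inner edge at 6 r_g.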

The ability to measure cosmological black-hole spin brings with it 
the potential to study directly the co-evolution of the black hole and its 
host galaxy’. The ultimate goal is to measure the spin in a sample of 
quasars as a function of redshift and to use the spin distribution as a 
window on the history of the co-evolution of black holes and galaxies*. 
Our measurement of the spin in RXJ1131—1231 is a step along that 
path, and introduces a possible way to begin assembling a sample of 
supermassive black-hole spins at moderate redshift with current X-ray 
observatories. 


METHODS SUMMARY 


We produced images for all 30 individual Chandra pointings (Fig. 1; see Methods 
for details), and spectra were extracted over the 0.3-8.0-keV energy band for each 
of the four lensed images in each observation (all energies are quoted in the 
observed frame unless stated otherwise). Previous studies’® have demonstrated 
that certain lensed images/epochs might suffer from a moderate level of pile-up”°. 
We therefore exclude spectra that displayed any significant level of pile-up in all 
further analysis (see Methods for details and Extended Data Figs 7 and 8). The 
remaining spectra sample a period of approximately 8 years, which allows for both 
a time-resolved and time-averaged analysis of RX J1131-1231. We also analyse a 
deep XMM-Newton observation taken in July 2013, which provides an average 
spectrum of the four lensed images over the 0.3-10.0-keV energy range. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Interrogating selectivity in catalysis using molecular 


vibrations 


Anat Milo1, Elizabeth N. Bess1 & Matthew S. Sigman1 


The delineation of molecular properties that underlie reactivity 
and selectivity is at the core of physical organic chemistry’*, and 
this knowledge can be used to inform the design of improved syn- 
thetic methods or identify new chemical transformations®°. For 
this reason, the mathematical representation of properties affect- 
ing reactivity and selectivity trends, that is, molecular parameters, 
is paramount. Correlations produced by equating these molecular 
parameters with experimental outcomes are often defined as free- 
energy relationships and can be used to evaluate the origin of selec- 
tivity and to generate new, experimentally testable hypotheses®’” ”. 
The premise behind successful correlations of this type is that a sys- 
tematically perturbed molecular property affects a transition-state 
interaction between the catalyst, substrate and any reaction compo- 
nents involved in the determination of selectivity'®"’. Classic phys- 
ical organic molecular descriptors, such as Hammett*, Taft* or 
Charton* parameters, seek to independently probe isolated electronic 
or steric effects**'*"*. However, these parameters cannot address 
simultaneous, non-additive variations to more than one molecular 
property, which limits their utility. Here we report a parameter sys- 
tem based on the vibrational response of a molecule to infrared 
radiation that can be used to mathematically model and predict se- 
lectivity trends for reactions with interlinked steric and electronic 
effects at positions of interest. The disclosed parameter system is 
mechanistically derived and should find broad use in the study of 
chemical and biological systems. 

A molecule’s structural features are embodied in its unique vibration 
modes, which invoke core, inherent bond force constants and atomic 
masses”’*'°. Thus, to interrogate intricate selectivity trends, we turned 
to vibrational energies. Kinetic isotope effects, a common reaction probe 
in biology and chemistry, further highlight the usefulness of relative 
vibrational energies for assessing mechanistic hypotheses””” (Fig. 1b). 
With this underpinning, we postulated that infrared vibrations could 
serve as mechanistically meaningful molecular descriptors in the study 
of catalytic reaction selectivity, allowing for correlations akin to free- 
energy relationships’ (Fig. 1c; see Methods Summary for details). 
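
The connection between atomic mass, force constant and vibrational energy invoked above can be illustrated with a diatomic harmonic-oscillator estimate. The sketch below is a simplification for illustration only: the force constant of 500 N m⁻¹ is an assumed, typical order of magnitude for a C-H stretch, the C-H and C-D bonds are treated as isolated two-body oscillators, and none of this reproduces the computed frequencies used in the study.

```python
import math

AMU = 1.66054e-27       # atomic mass unit, kg
C_CM_PER_S = 2.998e10   # speed of light, cm s^-1
H_PLANCK = 6.62607e-34  # Planck constant, J s
AVOGADRO = 6.02214e23   # mol^-1


def reduced_mass_kg(m1_u: float, m2_u: float) -> float:
    """Reduced mass of a two-body oscillator, masses given in atomic mass units."""
    return (m1_u * m2_u) / (m1_u + m2_u) * AMU


def wavenumber_cm1(force_constant_n_per_m: float, mu_kg: float) -> float:
    """Harmonic stretching wavenumber (cm^-1) from force constant and reduced mass."""
    omega = math.sqrt(force_constant_n_per_m / mu_kg)  # angular frequency, rad s^-1
    return omega / (2.0 * math.pi * C_CM_PER_S)


def zero_point_energy_kj_mol(wavenumber: float) -> float:
    """Zero-point energy (1/2) h c nu in kJ mol^-1."""
    return 0.5 * H_PLANCK * C_CM_PER_S * wavenumber * AVOGADRO / 1e3


K_ASSUMED = 500.0  # N m^-1, assumed value for illustration

nu_ch = wavenumber_cm1(K_ASSUMED, reduced_mass_kg(12.000, 1.008))
nu_cd = wavenumber_cm1(K_ASSUMED, reduced_mass_kg(12.000, 2.014))

print(f"C-H stretch: {nu_ch:.0f} cm^-1, ZPE {zero_point_energy_kj_mol(nu_ch):.1f} kJ/mol")
print(f"C-D stretch: {nu_cd:.0f} cm^-1, ZPE {zero_point_energy_kj_mol(nu_cd):.1f} kJ/mol")
print(f"frequency ratio nu_CD / nu_CH = {nu_cd / nu_ch:.3f}")
```

Because the force constant is unchanged on isotopic substitution, the frequency ratio depends only on the reduced masses, and the lower zero-point energy of the heavier isotopologue is the origin of the kinetic isotope effect mentioned in the text.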

Here we describe three case studies that substantiate the prospect of 
using molecular vibrations to predict and elucidate selectivity trends. 
We first consider the desymmetrization of bisphenols. In our recent 
reports of modelling catalytic systems, the catalyst or substrate librar- 
ies, or both, were specifically designed to avoid the complexity of inte- 
grated steric and electronic effects”'*’*. As an example, we studied the 
role of substrate steric effects in the peptide-catalysed desymmetriza- 
tion of bisphenols'*”° (Fig. 2a). A specific finding was that Verloop’s 
Sterimol parameters B1 (minimal radius) and L (length, Fig. 1a) could 
be used to successfully correlate enantioselectivity to the steric impact 
of the substituent R on the bisphenol’. Because this study was directed 
towards evaluation of steric effects, obvious electronic changes were 
initially avoided. 

To determine whether the perturbation to enantioselectivity in this 
reaction is purely steric in origin, we evaluated a substrate containing 
-CCl3, which is a -CMe3 (where Me is methyl) homologue according 
to its Sterimol values. The -CCl3-containing substrate yielded a much 


lower enantioselectivity than would be expected from purely steric 
considerations, suggesting that substituent electronic effects may affect 
the enantioselective outcome. To explore this outlier in the steric trend 
further, a model was developed on the basis of Sterimol values from 
an eight-membered training set of the original substrates (Fig. 2b). 
External validation for this model using four of the original sterically 
perturbed substrates and six new substrates containing concurrent 
steric and electronic modulation reveals a poor correlation (slope, 0.82; 
intercept, 0.18; R² = 0.60). The poorest performers were substrates 
with multifaceted structural features that cannot be accounted for by 
the parameter selection. The steric model’s shortcomings provided 
an impetus to explore infrared vibrations as a reaction interrogation 
technique. 
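
The slope, intercept and R² values used to judge a model's external validation can be computed as sketched below. This is a generic least-squares comparison of predicted against measured values, with measured values assumed to lie on the horizontal axis; the numbers are hypothetical placeholders, not data from the study, and the function name is illustrative.

```python
def validation_stats(measured, predicted):
    """Least-squares slope, intercept and R^2 of predicted versus measured values."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, predicted))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical normalized selectivities, for illustration only.
measured = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.5, 3.0]

slope, intercept, r2 = validation_stats(measured, predicted)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.3f}")
```

A perfectly predictive model would return a slope of 1, an intercept of 0 and R² of 1, which is the benchmark against which the validation figures in the text are read.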

Thus, we chose mechanistically relevant infrared frequencies to pro- 
duce molecular descriptors for the prediction of enantioselectivity’®. 
Figure 1 | Approaches to interrogating reaction mechanisms. a, Parameters 
used for free-energy correlations: Hammett electronic parameters based on 
the dissociation constant of substituted benzoic acids (top). Steric Sterimol 
parameters: substituent length (L), and minimal (B1) and maximal (B5) widths 
perpendicular to the length (bottom). b, Energetic considerations in free-energy
relationships (top) and in isotopologue vibrational energies (bottom). 
ΔΔG is the difference in Gibbs free energy between the two selectivity- 
determining transition states, Ea is the activation energy barrier and ZPE 

is the zero-point energy. c, Simulated infrared spectrum of benzoic acid 


and functional group spectral ranges. 
Figure 2 | Using infrared vibrations and Sterimol values to correlate 
enantioselectivity. a, Reaction scheme for the desymmetrization of bisphenols. 
Reactant torsion is proposed to have a role in the mechanism. b, Correlation 
between Sterimol values and enantioselectivity (normalized model), 
including the training set, previous sterically modulated validations, 

and new simultaneously sterically and electronically modulated validations. 
c, Correlation between Sterimol B1 (minimal width), vibrations and 
enantioselectivity (normalized model). d, Illustration of the vibrational 
frequencies used for the correlation of enantioselectivity: ν1, antisymmetric ring 
stretch with secondary C-H bends and a C-O stretch; ν2, symmetric ring 
stretch with a secondary O-H bend; ν3, antisymmetric ring stretch with 
secondary C-C bends and a C-O stretch. e, Parameter values for the steric 
model outliers (CCl3, F5Ph and p-t-BuPh) and for two sterically homologous 
but electronically divergent substituents (CMe3 and Ph). 


Because the steric and electronic features of R modulate enantioselec- 
tivity, we speculated that stretches of the bisphenol ring would be sen- 
sitive to enantioselectivity trends. These stretches are influenced by the 
mass and charge of R, incorporate various secondary C-H and C-C 
bends and affect the O-H group. Thus, the frequencies selected for 
modelling were six computationally derived, distinct sp² C=C stretches 
in the 1,700-1,500 cm⁻¹ spectral region, which involve either one or 
both rings*”** (Methods and Supplementary Information). The most 
predictive, statistically significant model developed (slope, 0.99; intercept,
-0.01; R² = 0.94) contains four parameters (Fig. 2c). These include 
Sterimol parameter B1, which describes the minimal radius of R, and 
three computationally derived infrared frequencies (Fig. 2c, d). The 
derived model is highly predictive for both isolated steric effects and 
substrates containing concomitant steric and electronic changes. 

The extreme outliers in the Sterimol analysis are R = -CCl3, -F5Ph 
and -4t-Bu-Ph (where t-Bu is tert-butyl) (Fig. 2b). If we consider the 
Sterimol parameters as descriptors of repulsive steric interactions 
within the catalytic reaction site, a geometry-based parameter would 
not suffice to define the first two substituents. Because vibrational analysis
takes charge distribution into account, and adds a directional 
aspect to steric interactions, it is able to address electronically diverse 
substituents. These results are consistent with a mechanistic hypo- 
thesis asserting a direct interaction between the peptide catalyst and 
the bisphenol substituent at the selectivity-determining step of the 
reaction'*”’; such an interaction would be sensitive to substituent
electronegativity. Inspecting the model reveals that in the cases of R =
-CCl3 and -F5Ph, all three vibrations are shifted to a higher frequency
relative to their respective non-halogenated steric homologues, -CMe3
and -Ph (Fig. 2e). This generalized trend could indicate that the
vibrational parameters are functioning as an electronic correction to the
steric description. 

It has been proposed that the minimum radius B1 (Fig. 1a) describes
steric effects proximal to the phenol rings, wherein the substituent applies 
torsion on the rings in a propeller-like strain’* (Fig. 2a). However, the 
inability to predict the enantioselectivity of -4t-Bu-Ph using Sterimol 
analysis points to a limitation in applying this parameter to substituted 
aromatic rings. Sterimol values define this group similarly to a -CMe3 
group in terms of B1. However, on the basis of both the distal location
of the -CMe3 group in -4t-Bu-Ph and the empirically observed
enantioselectivity (enantiomeric ratio of 82:18 for -4t-Bu-Ph versus 97.5:2.5
for -CMe3), this substituent more closely resembles an unsubstituted 
phenyl ring (enantiomeric ratio, 75:25). This significantly restricts the 
use of Sterimol values for evaluating steric effects in aromatic systems. 
Because infrared ring-stretching vibrations are modulated in response 
to substituent steric effects, they represent an auxiliary, directional aspect 
of the substituent geometry and allow for the prediction of groups such 
as -4t-Bu-Ph.

To explore the potential of infrared vibrational analysis further, as a 
second case study we evaluated our recently reported enantioselective 
hydrogenation of 1,1-diarylalkenes”* (Fig. 3a). The original scope in- 
cluded 1,1-diarylalkenes in which ring substitution patterns were mod- 
ulated to explore the origin of enantioselectivity (Fig. 3d). An unusual 
observation was that 3,5-dimethoxy substitutions were required to achieve 
high enantioselectivity (Fig. 3b). We began examining the origin of this 
observation using our new technology by including twelve of the ori- 
ginally reported substrates in the training set. This set was selected on 
the basis of both the rings’ substitution patterns, in terms of substituent 
position (that is, meta or para), and the diversity of steric and electronic 
effects, as well as a significant enantioselectivity range. For external 
validation, four of the original substrates were combined with eight new 
substrates, which were specifically designed to introduce additional 
structural patterns. 

Mechanistically relevant infrared vibrations were proposed accord- 
ing to structural features of the various substrates. Potential parameters 
for modelling enantioselectivity were six ring vibrations analogous to 
those used to describe the bisphenol rings in the previous case study. 
Additionally, three alkene vibrations were included, because the alkene 
is directly engaged with the catalyst during the transformation. Eval- 
uating the minimized structures of the diarylalkenes showed that aryl 
substitution affects ring torsion. Therefore, the measured distance bet- 
ween adjacent ortho-carbons on the geminal aryl rings was introduced 
into the parameter set (Fig. 3c). Finally, considering conjugation bet- 
ween the aryl rings and the alkene, we incorporated intensities and 
frequencies of two vibration modes that involve both groups (Fig. 3c). 

A predictive model (slope, 0.95; intercept, 0.06; R^2 = 0.88) with three
parameters was determined (Fig. 3b, Supplementary Information). Of 
particular note is that vibrational intensities were identified as relevant 
descriptors, in contrast to the bisphenol model, where frequencies are 
sufficient. Intensity and frequency measure different, but interrelated, 
effects; frequency is dependent on force constants, bond energy and 
molecular distances, whereas intensity is a derivative of the dipole mo- 
ment influenced by molecular symmetry and electronic structure”"*!””°. 
The three parameters included in the model are the torsion distance,
two infrared intensities and a cross-term between these intensities. The
first intensity, i_sym, belongs to the symmetric central C-C stretch between
the geminal aryls and the alkene, which is proposed to describe
conjugation, and the second, i_C-H, is the antisymmetric alkene C-H
stretch (Fig. 3c).

Figure 3 | Using infrared vibrations to correlate enantioselectivity.
a, Reaction scheme for the enantioselective hydrogenation of 1,1-diarylalkenes.
b, Mechanistically derived normalized model for the correlation of steric and
electronic features of the entire set of substrates to enantioselectivity.
c, Vibrational intensities and distance measurement used for the correlation of
enantioselectivity: d_tor, torsional distance between ortho-positions;
i_sym, symmetric central C-C stretch between rings and double bond;
i_C-H, antisymmetric alkene C-H stretch. d, Selected substrates from the
training set and their respective enantiomeric ratios (e.r.). e, Normalized model
derived for a subset of all substrates with a 3,5-methoxy motif on one aryl ring
and para-substitution on the geminal aryl. f, Normalized model derived for a
subset of all substrates with various meta-substitution patterns on both rings.

To assess the effect of each parameter on the prediction of select- 
ivity, two structurally distinct subsets of substrates were evaluated. 
Hammett σ-constants for para-substituents do not adequately describe
the enantioselectivity in a first subset where one aryl ring contains
a 3,5-dimethoxy motif and the para-position of the other is modulated
(8; Fig. 3e). Deconstructing the original model reveals that this series is
defined by only the torsion distance, d_tor, and the alkene stretch, i_C-H
(Fig. 3e). Although the source of the torsional effect may be electronic, 
these two observations imply that the role of para-substitution in 
determining enantioselectivity is not purely electronic. Interestingly, 
modulation of meta-substituents cannot be described by the same model,
and a new model for this second substrate subset relies only on the
i_sym vibration (9; Fig. 3f). It has been proposed” that meta-substituents
could act as directing groups by pre-coordinating to the catalyst. On
the basis of this mechanistic hypothesis, the identified intensity, i_sym,
may denote a selectivity-imparting, direct interaction of meta-substituents 
with the catalyst. Together, these effects serve to define substrates in 
which both meta- and para-positions have been perturbed sterically 
and electronically. Notably, in the full model (Fig. 3b), a cross-term of 
the two intensities implies a relation between these components for the 
description of selectivity, suggestive of synergistic effects between the 
aryl substituents and the alkene. 

In the final case study, we sought to expand this technique by ex- 
ploring site selectivity and simultaneously interrogating two reactants. 
This is demonstrated in the context of our recently reported redox- 
relay oxidative Heck arylation, which is highly enantioselective for a 
broad range of reaction partners and uses a simple chiral ligand’””* 
(Fig. 4a). An interesting mechanistic aspect of this process is the site 
selectivity of the organometallic addition, leading to constitutional
isomers (defined as γ and δ). As observed empirically, the respective natures
of the alkene and aryl sources both affect site selectivity. Specifically, a
Hammett correlation has been observed as a function of the boronic-acid
derivative (but only for meta- and para-substituted examples), with
the most electron poor leading to high site selectivity for γ-addition’’.
Furthermore, as the distance between the alkene and the alcohol is
increased, the selectivity is reduced. The latter result was initially
correlated to the difference in 13C NMR chemical shifts of the two alkene
carbons, also suggesting an electronic origin to site selectivity. Finally,
although we were unable to correlate the alkene substituent effect
quantitatively, we observed a qualitative trend of larger substituents leading
to higher site selectivity for γ-addition. These results suggest that both
steric and electronic effects affect site selection, providing an exciting 
platform to utilize our new infrared-vibration-based parameters. 

We elected to construct a model for this reaction in three stages. The 
first stage was aimed at determining whether infrared vibrations can be 
combined with Hammett analysis to model site selectivity as a func- 
tion of both boronic-acid and alkene substituents. A specific hypoth- 
esis was that the alkene C=C stretching frequency would be sensitive 
to substituent changes at R (Fig. 4b). To test this, we designed” an 
eight-membered library by modulating the alkenol sterically (R = i-Pr 
(iso-propyl), Et (ethyl), Me) and the boronic acid electronically (E =
CO2Me, F, OMe). Using the C=C stretching frequency in combination
with the Hammett σ-value provided an excellent model (slope,
0.98; intercept, 0.02; R^2 = 0.98; Fig. 4b). These results suggest that the
C=C stretching frequency effectively describes the substituent effect 
on the alkene. 

The second stage of this study was directed at the more profound 
question of describing simultaneous steric and electronic variation. 
Specifically, we wished to assess whether vibrational analysis could 
be a substitute for Hammett analysis, yet also allow for the modelling 
of multiple substituents on the ring or ortho-substitution—a signifi- 
cant deficiency of Hammett analysis. Inspired by this question and 
the classic Hammett analysis, we elected to use infrared vibrations of 
substituted benzoic acids as a generalized parameter system and as a 
mimic for arylboronic acids. We chose to examine a single homoallylic 
alcohol in combination with 12 distinctive aryl boronic acids, includ- 
ing ortho-substituted examples. An excellent model was derived, 
incorporating a ring stretch intensity, the carboxylic acid C=O stretch 
frequency and a cross-term between the two (slope, 0.95; intercept, 
0.08; R^2 = 0.94; Fig. 4c). The cross-term is the most important term,
with the largest regression coefficient, suggesting that, together, these 
two vibrations synergistically serve to describe the various effects induced 
by the boronic-acid component. Markedly, substrates with ortho- 
substitution were effectively predicted. 

As the final stage, a model was desired to account for all the products
with measured site-selectivity ratios, including those with different
alkene geometries, chain lengths, R groups and arylboronic acids (Fig. 5a).
A model was developed that effectively predicts 17 external validations
(slope, 1.01; intercept, 0.03; R^2 = 0.92; Fig. 5b, c). The terms describing
the alkene are the sp2 C-H symmetric stretch and a cross-term between
it and the C=C stretch. In this case, the difference relative to those
above (Fig. 4b, c) is that this inclusive model must adequately describe
the effect of chain length and alkene geometry. Impressively, only two
examples of these variations are included in the training set. The arene
is described by the same benzoic-acid ring stretch intensity as above, as
well as its cross-term with the carbonyl stretch, but an additional ring
frequency improves the model. Again, this is presumably due to the
more complex interactions between the diverse alkenols and the boronic
acids. An intriguing feature of this type of more comprehensive modelling
is the potential ability to apply it in extrapolative fashion to predict
the performance of new substrates. Because this is a reaction under
further investigation, the present analysis will inform mechanistic
hypotheses and reaction development.

Figure 4 | Using infrared vibrations to correlate site selectivity.
a, Enantioselective redox-relay Heck reaction scheme and mechanistic
considerations. b, Correlation of site selectivity to the arene Hammett value and
alkenol double-bond stretch (normalized model). c, Correlation between site
selectivity with different aryl boronic acids and vibrational parameters that
describe the aryl moiety (normalized model).

Figure 5 | Developing a comprehensive model. a, Enantioselective
redox-relay Heck general reaction scheme. b, The training set used to develop a model
describing site selectivity for products with different alkene geometries,
chain lengths, R groups and aryl boronic acids. c, Normalized model and
vibrational terms used for the correlation of enantioselectivity. The two terms
on the right, C=C stretch and C-H intensity, are used to describe the alkenol,
and the three on the left, C=O stretch, ring intensity and ring stretch, describe
the aryl.

In this work, we have introduced a method to model reaction select- 
ivity in catalytic systems based on vibrational parameters. The value of 
this approach is that simultaneous changes to steric and electronic 
properties can be accounted for and successfully evaluated, overcom- 
ing a significant limitation of classical parameter systems. 


METHODS SUMMARY 


To execute a strategy that involves computationally derived infrared vibrations as 
molecular parameters for the mathematical interpretation of selectivity in catalytic 
reactions, two steps were performed. The first step was identifying vibrations that 
are sensitive to substituent effects within a given reaction. Realizing this step re- 
quires an initial hypothesis for selecting mechanistically relevant vibrations, because 
complex molecules have an abundance of vibrational modes. Additionally, the 
selected vibration modes must be identified for all modulated substrates, thus repre- 
senting the same effect throughout the data set. The abundance, superposition and 
coalescence of vibration modes lead to complex infrared spectra. This complexity 
stresses the utility of computed infrared spectra because individual vibrations can 
be visualized, allowing us to pick out the correct vibration terms for each set of 
molecules studied. Thus, to produce predictive linear regression models, in each 
case study we selected specific infrared vibrational modes with a mechanistic 
hypothesis in mind. Indeed, we found that a hypothesis-based interrogation of the 
experimental space is required to suggest parameters that can yield predictive, mech- 
anistically relevant models. The second step requires design and evaluation of an 
empirical training set, which encompasses a systematic variation of substituent 
properties”’. The experimental output, enantiomeric or site-selectivity ratios, were 
mathematically modelled through linear regression techniques to reveal which of 
the proposed parameters allow for the prediction of new outcomes” (Methods and 
Supplementary Information). The models produced were evaluated for their good- 
ness of fit, and their robustness is demonstrated by external validations’ goodness 
of fit. The nearer the R^2 and slope values are to one (indicating a tight, one-to-one
correlation between predicted and measured outcomes) and the nearer the inter- 
cept is to zero (indicating minimal systematic error), the more robust the model. Of 
the potential models, those containing a minimal number of parameters were 
preferred, because this allows for a mechanistically informative interrogation. 
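
The goodness-of-fit criteria described above (slope and R^2 near one, intercept near zero) can be illustrated with a minimal Python sketch. It assumes that predicted values sit on the x axis and measured values on the y axis; the function name `goodness_of_fit` and the numbers are hypothetical and are not data from this work.

```python
def goodness_of_fit(predicted, measured):
    """Return (slope, intercept, r_squared) for measured regressed on predicted."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(predicted) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    syy = sum((y - mean_y) ** 2 for y in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, measured))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical predicted and measured selectivities (kcal/mol), for illustration only.
predicted = [0.20, 0.60, 1.10, 1.50, 2.00]
measured = [0.25, 0.55, 1.15, 1.45, 2.05]
slope, intercept, r_squared = goodness_of_fit(predicted, measured)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.2f}")
```

Applying the same function to an external validation set, rather than to the training set, gives the robustness check described in the text.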


Online Content Any additional Methods, Extended Data display items and Source
Data are available in the online version of the paper; references unique to these
sections appear only in the online paper.
METHODS


In this work, we develop linear regression models using parameters that pertain to 
steric and electronic properties of the substrates in each case study. For the
development of a model, mechanistically relevant parameters were equated to the
enantioselectivity or site selectivity in terms of relative energy, ΔΔG‡ (kcal mol^-1).
The first step requires selection of an adequate library that could be used as a 
training set for the identification of a model that describes selectivity. An oversized, 
poorly designed training set could lead to the inclusion of parameters that describe 
idiosyncrasies of the training set rather than true reactivity or selectivity trends”’. 
Design-of-experiment principles guide training set selection, that is, systematic 
variation of substituent properties”. 
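
As a hedged illustration of how a selectivity ratio maps onto a relative energy, the sketch below uses the standard relation ΔΔG‡ = RT ln(major/minor). The temperature of 298.15 K is an assumption, because the reaction temperature is not stated in this section, and the function name `ddg_from_ratio` is hypothetical. The ratios are the enantiomeric ratios quoted in the text for the bisphenol case study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_ratio(major, minor, temperature_k=298.15):
    """Return ΔΔG‡ in kcal/mol for a major:minor selectivity ratio.

    temperature_k defaults to 298.15 K, which is an assumption made here
    for illustration only.
    """
    if major <= 0 or minor <= 0:
        raise ValueError("both ratio components must be positive")
    return R_KCAL * temperature_k * math.log(major / minor)


# Enantiomeric ratios quoted in the text for three substituents.
ratios = {"-CMe3": (97.5, 2.5), "-4t-Bu-Ph": (82.0, 18.0), "-Ph": (75.0, 25.0)}
for substituent, (major, minor) in ratios.items():
    print(f"{substituent}: {ddg_from_ratio(major, minor):.2f} kcal/mol")
```

Working on the energy scale rather than on raw ratios is what makes a linear model appropriate, since equal steps in ΔΔG‡ correspond to equal multiplicative changes in the ratio.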

For the case study on the desymmetrization of bisphenols, steric models were 
previously reported’*”” and, hence, we were able to minimize the substrate training 
set by comparing the Sterimol B1 and enantioselectivity values of these studies. It
was assumed that because the B1 values have the greatest effect on
enantioselectivity, substrates with similar B1 and enantioselectivity values are redundant in the
training set. In the subsequent case studies, we applied chemical intuition con- 
cerning steric and electronic perturbations to select a training set. A general con- 
sideration in picking training sets is choosing a minimal number of substrates that 
exemplify many of the structural variations present in the entire data set. Having 
this in mind, it should be noted that we purposefully excluded some substrates 
from the training sets, with the intention of interrogating the ability of a model to 
externally predict these substrates. Such an external validation establishes the gen- 
erality of a model, because it is able to predict variations that were not explicitly 
incorporated in the training set. One such deliberate omission of substrates was 
ortho-substituted boronic acids in the training set in the case study on Heck arylation.

The second step includes four stepwise regression algorithms (for details, see 
Supplementary Information) that assess the significance of each parameter by apply- 
ing statistical criteria. To realize this assessment, each set of parameter values is nor- 
malized by subtracting its respective mean and dividing by its standard deviation. 
The four stepwise regression algorithms are built into the MATLAB statistical tool- 
box and add or remove normalized parameters from an initial model according to 
a P value threshold**”’. Additional suggestions for models are inputted manually
on the basis of the results of the four preliminary algorithm runs or on mechanistic 
hypotheses. A linear fit is performed to probe each manually suggested model, and 
each suggestion is examined as an initial model for a subsequent stepwise regres- 
sion iteration to seek a more effective model. After identifying several statistically 
probable models, an external validation is carried out to determine the predictive 
nature of the proposed models for substrates that are not included in the training 
set. This procedure allows us to propose statistically probable models that describe 
the training set, and then to assess the predictive efficacy of each model towards an 
external validation set. 
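
The normalization and fitting steps can be sketched in plain Python. This is not the MATLAB stepwise procedure used in this work: it shows only the z-score normalization and a single ordinary least-squares fit of a model with two parameters and their cross-term. All names (`zscore`, `solve_linear`, `fit_ols`) and all numbers are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def zscore(values):
    """Subtract the mean and divide by the (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("a constant parameter cannot be normalized")
    return [(v - m) / s for v in values]


def solve_linear(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: parameters are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(columns, y):
    """Least-squares fit via the normal equations; returns [intercept, coefficients...]."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Hypothetical parameter values: one frequency (cm^-1) and one intensity.
freq = [1600.0, 1610.0, 1625.0, 1640.0, 1655.0, 1670.0]
inten = [0.8, 1.9, 1.1, 2.7, 1.5, 2.2]
a, b = zscore(freq), zscore(inten)
cross = [x * y for x, y in zip(a, b)]  # cross-term of the normalized parameters

# Synthetic response built from known coefficients so that the fit can be checked.
true_coeffs = [0.5, 1.0, -2.0, 0.75]
ddg = [true_coeffs[0] + true_coeffs[1] * x + true_coeffs[2] * y + true_coeffs[3] * z
       for x, y, z in zip(a, b, cross)]
coeffs = fit_ols([a, b, cross], ddg)
print([round(c, 3) for c in coeffs])
```

Because every parameter is normalized before fitting, the magnitudes of the fitted coefficients can be compared directly, which is how a cross-term can be identified as the most important term in a model.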

The theoretical context of this work is epitomized by Hammett’s seminal obser- 
vation that the acidity of benzoic acid derivatives can be correlated to new equilibria 
and, ultimately, to reaction rates**’” (Fig. 1a). Such types of free-energy relation- 
ships have been an extraordinarily revealing tool for reaction study!*?'*!8*4°8, 
However, a significant limitation is the inability of a single parameter to describe a 
substituent that introduces perturbations to more than one molecular property. As 
an example, Hammett parameters can be used only to describe the ortho-position on 
a benzene ring in cases where there is no steric effect involved, limiting their 
applicability to data sets that contain solely electronic trends’’. Conversely, steric 
molecular descriptors, which relate to the spatial arrangement of atoms in a
molecule, such as Taft**’, Charton>“’” or Sterimol”**“* parameters (Fig. 1a), can be used
only to interrogate a reaction trend in the absence of significant electronic effects. 




To overcome the aforementioned limitations, we applied a parameter set of 
molecular vibrations for the quantitative analysis of concurrent steric and elec- 
tronic structural perturbations. Therefore, in each of the disclosed case studies, we 
selected and applied several infrared frequencies and intensities, as well as addi- 
tional mechanistically relevant parameters that are presumed to have a bearing on 
selectivity. This approach is reminiscent of developing quantitative structure activ- 
ity relationship (QSAR) regression models for the exploration of chemical or bio- 
logical systems*’**. Yet, so far, infrared vibrations have not been an established 
parameter of choice in QSAR“ >. As an example, the evaluation of the fingerprint 
region of experimental infrared spectra as a whole (between 1,500 and 600 cm^-1)
underperformed relative to models derived from typical QSAR parameters”’. Never- 
theless, by proposing infrared vibrational-energy-derived parameters that repres- 
ent mechanistically pertinent features of the substrates and reactions at hand, it is 
possible to produce predictive, meaningful correlations. 
activation of amines

Achieving site selectivity in carbon-hydrogen (C-H) functionalization
reactions is a long-standing challenge in organic chemistry.
The small differences in intrinsic reactivity of C-H bonds in any 
given organic molecule can lead to the activation of undesired C-H 
bonds by a non-selective catalyst. One solution to this problem is to
distinguish C-H bonds on the basis of their location in the mole- 
cule relative to a specific functional group. In this context, the acti- 
vation of C-H bonds five or six bonds away from a functional group 
by cyclometallation has been extensively studied'"'*. However, the 
directed activation of C-H bonds that are distal to (more than six bonds 
away) functional groups has remained challenging, especially when 
the target C-H bond is geometrically inaccessible to directed metal- 
lation owing to the ring strain encountered in cyclometallation’*”’. 
Here we report a recyclable template that directs the olefination and 
acetoxylation of distal meta-C-H bonds—as far as 11 bonds away—of 
anilines and benzylic amines. This template is able to direct the meta- 
selective C-H functionalization of bicyclic heterocycles via a highly 
strained, tricyclic-cyclophane-like palladated intermediate. X-ray and 
nuclear magnetic resonance studies reveal that the conformational 
biases induced by a single fluorine substitution in the template can be 
enhanced by using a ligand to switch from ortho- to meta-selectivity. 

The selective functionalization of inert C-H bonds at different sites 
of organic molecules provides an opportunity for the introduction of 
diverse structural modifications and the development of novel retro- 
synthetic disconnections. However, the widespread application of C-H 
functionalization in organic synthesis is hampered by a lack of catalysts, 
reagents and methodologies that enable the site-selective functionali- 
zation of C-H bonds, which often have very subtle differences in intrinsic 
reactivity. We have broadly focused on the development of metal-catalysed 
C-H activation reactions that are directed by weakly coordinating func- 
tional groups’. In analogy to the principles of proximity-driven metallation’, 
this type of methodology enables the selective functionalization of C-H 
bonds that are five or six bonds away from the directing atom, through 
cyclometallation® *. Although this approach has enabled the discovery 
of numerous transformations over the past decade, the functionaliza- 
tion of C-H bonds that are located farther away from the coordinating 
functional group remains a largely unsolved problem in organic synthesis, 
especially when their locations do not permit cyclometallation owing 
to geometric strain’. 

Recently we developed an end-on-coordinating, nitrile-based template 
that is able to direct Pd(II)-catalysed meta-selective olefination and
arylation of hydrocinnamic acids'*"’. This discovery led us to explore three
key questions: first, whether this new end-on template approach can be 
applied to other substrate classes; second, whether other types of trans- 
formation using different catalytic manifolds can be achieved using end- 
on templates; and, third, whether there are critical and general underlying 
principles for the design of an effective template to direct remote C-H 
activation. In this context, we recognized that the selective activation 
of C-H bonds at C7 of tetrahydroquinolines is a conceptually intri- 
guing and synthetically important challenge. A novel template will be 
required to accommodate a highly strained intermediate with a tricy- 
clic cyclophane structure (Fig. 1a). 


Here we report the rational design of a nitrile-containing template 
that directs C7-selective C-H activation of tetrahydroquinolines (Fig. 1). 
By systematically modifying the structure of the template, we identified 
the template conformation as a critical factor in favouring remote meta- 
or proximate ortho-selectivity (Fig. 1b). Remarkably, by tuning the
properties of the Pd(II) catalyst through use of N-acetyl-glycine (Ac-Gly-OH)
as a ligand, the pre-existing conformational bias in the template can 
be further amplified to achieve remote meta-C-H olefination in excel- 
lent yield and with high levels of site selectivity (Fig. 1c). This opti- 
mized template is broadly applicable to the remote C-H activation of 
2-phenylpyrrolidines, 2-phenylpiperidines and other aniline-type sub- 
strates, despite the intrinsic electronic biases in these substrates that 
favour ortho-functionalization (Fig. 1d). In addition to meta-C-H
olefination via a Pd(II)/Pd(0) redox cycle, we were also able to
demonstrate meta-C-H acetoxylation via Pd(II)/Pd(IV) catalysis using this
template. This template can be easily installed and later recycled, similar 
to chiral auxiliaries that are widely used in organic synthesis, such as the 
well-known Evans oxazolidinone. This work paves the way for practical 
applications of remote C-H activation through template control. 

A procedure exists for the meta-selective olefination of hydrocinna- 
mic acids using a novel end-on 2-aminobenzonitrile template’*. Cru- 
cial for the meta-selectivity, the directing nitrile group is in extended 
conjugation with the carbonyl moiety of the substrate, which positions 
the nitrile group in close proximity to and coplanar with the target meta- 
C-H bond. However, in translating this insight to the meta-selective 
olefination of tetrahydroquinolines, we needed to design an entirely 
novel end-on template and we faced several considerable challenges in 
determining the optimal structural design. First, the hypothetical pal- 
ladation intermediate would involve a highly strained intermediate with 
a tricyclic cyclophane structure (Fig. 1a). Second, although a simple amide 
linkage is desirable for practical attachment of the template, we rea- 
lised that the amide group could potentially favour the activation of 
the ortho-(C8) position, an established mode of reactivity for anilide- 
type substrates’. Moreover, amine substituents are well-known ortho/ 
para directors in electrophilic aromatic substitution reactions, including 
electrophilic palladation. Third, to avoid the pitfall of over-engineering, 
we hoped to develop a simple amide template with an sp3-hybridized
backbone without having to build in an extended conjugation. We recog- 
nized that such a template could lead our substrates to exist in multiple 
interconverting conformations, each with a low equilibrium population. 
This would translate into a high entropic barrier for formation of the 
highly organized transition state required for cyclopalladation. We there- 
fore aimed to acquire an improved understanding of how the confor- 
mation of non-constrained atoms in a template can be manipulated to 
favour remote C-H activation. In addition, we hoped to develop a cat- 
alyst to recognize and harness subtle conformational biases and amplify 
pre-existing template-induced preferences for meta-selectivity. 

To begin our investigation, we attached the simple nitrile templates 
T,-T3 to tetrahydroquinoline and tested these substrates in a model 
reaction, the Pd(11)-catalysed C-H olefination (Fig. 2a, b). Although the 
reaction of 1 did not provide any olefinated products, 2 and 3 afforded 
mixtures of positional isomers that are difficult to separate. The addition 
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Figure 1 | Design of a versatile template to direct meta-C-H activation. 

a, The challenge of remote meta-C-H activation of bicyclic heterocycles, 
illustrated by previously inaccessible C-H bonds and highly strained 
tricyclic cyclophane intermediates. DG, directing group. b, Proposed 
conformation-controlled meta-C-H activation. Me, methyl. c, Amplification 


of our previously identified ligand, Ac-Gly-OH, did not improve the 
level of selectivity with these templates. We attributed the poor site selec- 
tivity to either a lack of conformational rigidity in the template back- 
bones (T, and T2) or the template being too short to reach the target 
C-H bond (T3). We subsequently explored the «-hydroxy template 
structure (T4, Ts and T.) and observed an encouraging trend favour- 
ing C7 selectivity in C-H olefination (Fig. 2b). Although exclusive C8- 
olefination was observed using template T, (substrate 4), the use of 
templates T; (substrate 5) and T, (substrate 6a) afforded C7-olefinated 
product with selectivities of 11% and 20%, respectively. Although the 
level of meta-selectivity was still poor with these templates, the obser- 
vation that a single fluorine substituent, in Tg, nearly doubles the meta- 
selectivity prompted us to investigate the origin of this phenomenon. 

Because fluorine substituents can lead to pronounced changes in mole- 
cular conformation”®”’, we studied the conformations of 4-6 in the 
solid state and in solution. X-ray crystal structures of 4 and 5 showed 
that the carbonyl group is perfectly oriented to perform ortho-C-H acti- 
vation. However, in the X-ray crystal structure of 6a the carbonyl group 
is oriented away from C7, which presumably makes it better suited for 
the nitrile moiety to approach the meta-C-H bond (C8). Studies of the 
nuclear Overhauser effect showed that a similar conformational trend 
is present in solution (Supplementary Information). Steric hindrance 
from the gem-dimethyl substituents in 4 probably raises the activation 
energy for amide N-C bond rotation, leading to highly restricted intercon- 
version between the pro-ortho (ground state) and pro-meta conformations. 
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of meta-selectivity by use of a ligand. d, Scope of meta-C-H activation: 
olefination and acetoxylation. We highlight the C-H bonds and the palladation 
intermediate (blue), the newly formed bonds (red) and the importance of 
fluorine substitution to this reactivity (magenta). 


In contrast, the amide C-N bond in 5 can more freely rotate, leading to 
some of the meta-C-H olefination product. Notably, introduction of 
the α-fluorine substitution in 6a leads to a conformational switch wherein
the carbonyl group is directed away from the ortho-C-H bond. Increas- 
ing the proportion of this conformation seems to improve the rate of 
meta-C-H activation; however, a low barrier for interconversion between 
the pro-meta and pro-ortho conformers still results in substantial levels 
of ortho-C-H activation. 

We reasoned that the meta-selectivity of 6a could be amplified by the
use of a bulkier and more electron-rich catalyst because the C7 position 
is less sterically hindered and more electron poor than C8 (ref. 22). In 
fact, we found that the use of Ac-Gly-OH as a ligand in conjunction with 
template T6 improves the level of meta-selectivity from 20% to 92%
(Fig. 2b). After deprotection, the meta-olefinated product 7a was iso- 
lated in 75% yield. Under these ligand-enhanced reaction conditions, the 
meta-selectivity of substrate 5 is also markedly increased (11% to 84%). 
In contrast, olefination of substrate 4 with Ac-Gly-OH as a ligand pro- 
vides the olefinated product with 91% ortho-selectivity (92% yield). These 
results suggest that, although the Ac-Gly-OH ligand can promote meta- 
selectivity, the appropriate conformation of the template is a prerequisite 
for achieving high levels of meta-selectivity. We have also tested the
analogue of T6 that contains α-difluoro substituents, in an attempt to
improve the meta-selectivity further. We obtained a yield and
meta-selectivity similar to those gained using T6, indicating that the
electronic effect does not have a major role in controlling
meta-selectivity (Supplementary Information).
Figure 2 | Development of templates to direct meta-C-H olefination.
a, Olefination of tetrahydroquinoline derivatives. b, Six representative
templates designed to screen the meta-C-H olefination.
c, Tetrahydroquinolines with a variety of substitution patterns appended
with template T6 (6a-6h) undergo facile meta-C-H olefination. The yields
of 2 and 3 are NMR yields with CH2Br2 as the internal standard. The
selectivity is not determined, owing to multiple olefinated products
detected. The isolated yields of other olefinated products are shown along
with the selectivity (combined yields are shown in b). See Supplementary
Information for experimental details. Selectivity of the olefinated
products was determined by ¹H NMR analysis or gas chromatography mass
spectrometry (GCMS) using a flame ionization detector. The variance is
estimated to be within 5%. HFIP, hexafluoroisopropanol.

Having established an optimal system for meta-selective C-H olefination,
we proceeded to investigate the scope of tetrahydroquinolines (Fig. 2c).
Substitution at C2 improves the meta-selectivity (7b, 97:3).
Substitutions at the C5 and C6 positions were well tolerated (7c-7f).
C8 substitution decreases the level of meta-selectivity to 78:22,
presumably owing to the steric hindrance of the C7 position (7g).
Dihydrobenzoxazine 6h was also selectively meta-olefinated to give the
desired product in 71% isolated yield (7h). It is worth noting that the
C7 positions of these heterocyclic skeletons are very difficult to
functionalize, and that this methodology could enable the synthesis of a
new subclass of these medically active heterocycles.

We subsequently explored the use of our template system in the
meta-selective olefination of anilines. The directed ortho-C-H olefination
of acetanilides demonstrates the high reactivity of the ortho-position of
these substrates towards palladation. A meta-selective C-H olefination
of anilines would offer a complementary retrosynthetic disconnection
for the synthesis of substituted anilines. Thus, aniline was attached to
the optimized template T6 and subjected to the established olefination
conditions (Fig. 3). A mixture of mono- and di-olefinated products
(9a(mono) and 9a(di)) was obtained in 88% combined yield with 99%
meta-selectivity, suggesting that this template can successfully override
an electronic bias towards ortho-palladation. A variety of electron-rich
and electron-poor meta-substituted anilines gave the mono-olefinated
products in good to high yields with the meta-selectivity ranging from 94%
to 99% (9b-9g). Ortho-substituted anilines were selectively olefinated
at the less hindered position to give mono-olefinated products (9h(mono)
and 9i(mono)) accompanied by small amounts of the di-olefinated
byproducts. Despite the steric hindrance, excellent meta-selectivity was
also obtained for a number of para-substituted anilines, albeit with varied
mono- or di-selectivity (9j-9o). Polysubstituted anilines were also
smoothly olefinated at the remaining meta-positions to afford anilines
9p-9r containing complex 1,2,3,5-substitution patterns. Meta-olefination
of 8d was also carried out using 5 mol% Pd(OAc)2 to give the desired
product, 9d, in 60% isolated yield (Supplementary Information). The
replacement of the N-methyl group by a readily removable benzyl-type
protecting group was also tolerated, thus improving the utility of this
reaction (9s and 9t). The hydrogenolysis of the benzyl-type protecting
group also reduced the installed olefin unit to the corresponding alkyl
group (Supplementary Information). Finally, a range of olefin coupling
partners, including 1,2-di-substituted olefins, were shown to be compatible
with this transformation, demonstrating broader scope than the majority of
directed ortho-C-H olefination reactions (9b1-9b7).

[Figure 3 graphic not reproduced: reaction scheme and structures of
products 9a-9t and 9b1-9b7 with isolated yields and selectivities.
Conditions shown: Pd(OAc)2 (10 mol%), Ac-Gly-OH (20 mol%), AgOAc (3 equiv.),
olefin (1.2-2.0 equiv.), HFIP, 90 °C, 24-48 h.]

Figure 3 | Template-directed remote C-H olefination of N-methylanilines.
a, Anilines with a variety of substitution patterns were found to undergo
facile meta-C-H olefination. b, In the box, electron-deficient olefins and
various di- and tri-substituted olefins were compatible with the
transformation in a. The isolated yields of the mono-olefinated product
(and also the isolated yields of the di-olefinated product, when
applicable) are shown along with the selectivity. See Supplementary
Information for experimental details. Selectivity of the mono- and
di-olefinated products was determined by ¹H NMR analysis and GCMS analysis
using a flame ionization detector. The variance is estimated to be less
than 5%. Ph, phenyl.

The utility of our template-directed remote C-H activation approach
in developing other types of C-C and C-heteroatom bond-forming
reactions via different catalytic manifolds remained to be demonstrated.
Thus, amine substrate 8a was subjected to various C-H oxidation reaction
conditions (Fig. 4a, b). Notably, these transformations proceed via
a Pd(II)/Pd(IV) redox chemistry as opposed to the Pd(0)/Pd(II) catalytic
cycle of C-H olefination. We found that the use of PhI(OAc)2 oxidant
affords meta-acetoxylated amine 10a as the major product in 60% isolated
yield. Excellent levels of meta-selectivity (90%-98%) were obtained
with various substituted anilines. A variety of ortho-, meta- and
para-substituents were tolerated in this transformation (10b-10j) although
more electron-withdrawing substituents led to a depreciation in yield
(10f and 10g). Meta-selective acetoxylation of an ortho,meta-di-substituted
aniline was also successful (10q). The versatility of our newly
developed methodology is further demonstrated by the meta-selective
acetoxylation of acyclic and cyclic benzylamines (Fig. 4c, d). Notably,
the C-H bonds that are cleaved in these benzylamine substrates are
11 bonds away from the directing nitrogen atom, which is an
unprecedentedly long distance for direct C-H activation. Meta-selective
acetoxylation of 2-phenylpyrrolidine 11c and 2-phenylpiperidine 11d is a
potentially powerful methodology for accessing diverse structures of
medicinally important heterocycles. The hydrolytic removal of the
template also converted the acetate to a hydroxyl group in one pot
(Supplementary Information).

Figure 4 | Meta-C-H acetoxylation of N-methylanilines and benzylamine
derivatives. a, Anilines (8) with a variety of substitution patterns undergo
facile meta-C-H acetoxylation. b, The isolated yield of the acetoxylated
aniline is shown along with the selectivity. c, Benzylamine derivatives
(11a-11d) were found to undergo facile meta-C-H acetoxylation. d, The
isolated yield of the acetoxylated benzylamine derivative is shown along
with the selectivity. See Supplementary Information for experimental
details. The selectivities of the acetoxylated products were determined by
¹H NMR analysis and GCMS analysis using a flame ionization detector. The
variance is estimated to be less than 5%.

Preliminary mechanistic studies of the acetoxylation of aniline also 
revealed that conformation of templates, again, has a decisive role in 
controlling the site selectivity. Control experiments showed that Ac-Gly- 
OH had only a minor beneficial effect on the yield of acetoxylation and 
a negligible influence on the site selectivity. Remarkably, we found that
the levels of meta-selectivity of the acetoxylation of aniline were
respectively 46%, 66% and 92% when templates T4, T5 and T6 were used in
the absence of an amino-acid ligand, reflecting the intrinsic conforma- 
tional biases of these templates very clearly (Supplementary Information). 

We have developed a versatile template approach to direct the remote 
meta-C-H bond activation of tetrahydroquinoline, benzoxazines, ani- 
lines, benzylamines, 2-phenylpyrrolidines and 2-phenylpiperidines, all 
of which are commonly used as building blocks in drug discovery.
Template T6 can be readily installed through acylation of the amine substrates
with the commercially available 2-(2-cyanophenoxy)-2-fluoroacetic 
acid (Sigma-Aldrich catalogue number: 791369). Owing to their elec- 
tronic biases, these amine substrates are incompatible with other known 
approaches for meta-C-H activation. We demonstrate that small
conformational biases can be amplified by the judicious use of an
amino-acid ligand to enhance meta-selectivity drastically, although the
precise role of the α-fluoro group on the template conformation remains
hypothetical at this stage. 


METHODS SUMMARY 


The general procedure for template-directed meta-selective C-H olefination is 
as follows. A 35-ml sealed tube (with a Teflon cap) equipped with a magnetic stir 
bar was charged with amide substrate (0.10 mmol, 1.0 equiv.), Pd(OAc)2 (2.3 mg, 
0.01 mmol, 10 mol%), Ac-Gly-OH (2.4 mg, 0.02 mmol, 20 mol%) and AgOAc (50 mg, 
0.30 mmol, 3.0 equiv.). HFIP (0.5 ml) was added to the mixture, followed by ethyl 
acrylate (1.2-2.0 equiv.) and, finally, another portion of HFIP (0.5 ml). The tube 
was then capped and submerged into an oil bath pre-heated to 90 °C. The reaction 
was stirred for 24-48 h and cooled to room temperature (~25 °C). The crude reac- 
tion mixture was diluted with EtOAc (5 ml) and filtered through a short pad of 
Celite. The sealed tube and Celite pad were washed with an additional 20 ml of 
EtOAc. The filtrate was concentrated in vacuo, and the resulting residue was puri- 
fied by preparative thin-layer chromatography using hexanes, EtOAc and dichlor- 
omethane as the eluent. The positional selectivity was determined by GCMS with 
a flame ionization detector, and by ¹H NMR analysis of the unpurified reaction
mixture. Full experimental details and characterization of new compounds can be 
found in Supplementary Information. 
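
The reagent loadings in the general procedure can be cross-checked with a short calculation. The sketch below converts each charged mass to millimoles and to equivalents relative to the 0.10 mmol of substrate. The molar masses are computed from standard atomic weights and are an assumption of this example (they are not stated in the text); the dictionary and function names are illustrative only.

```python
# Molar masses in g/mol, from standard atomic weights (assumed here,
# not given in the procedure).
MOLAR_MASS = {
    "Pd(OAc)2": 224.51,
    "Ac-Gly-OH": 117.10,
    "AgOAc": 166.91,
}

# Amide substrate, 1.0 equiv. (from the procedure).
SUBSTRATE_MMOL = 0.10

# Masses charged to the sealed tube, in mg (from the procedure).
CHARGED_MG = {
    "Pd(OAc)2": 2.3,
    "Ac-Gly-OH": 2.4,
    "AgOAc": 50.0,
}


def mmol_from_mg(mass_mg, molar_mass):
    """Convert a mass in mg to mmol: mg / (g/mol) = mmol."""
    return mass_mg / molar_mass


def equivalents(reagent):
    """Equivalents of a reagent relative to the substrate."""
    mmol = mmol_from_mg(CHARGED_MG[reagent], MOLAR_MASS[reagent])
    return mmol / SUBSTRATE_MMOL


if __name__ == "__main__":
    for name in CHARGED_MG:
        eq = equivalents(name)
        print(f"{name}: {eq * SUBSTRATE_MMOL:.3f} mmol, "
              f"{eq:.2f} equiv. ({eq * 100:.0f} mol%)")
```

Within rounding of the weighed masses, the results agree with the stated 10 mol%, 20 mol% and 3.0 equiv. loadings.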

Hydrous mantle transition zone indicated by
ringwoodite included within diamond

The ultimate origin of water in the Earth’s hydrosphere is in the
deep Earth—the mantle. Theory’ and experiments”* have shown 
that although the water storage capacity of olivine-dominated shal- 
low mantle is limited, the Earth’s transition zone, at depths bet- 
ween 410 and 660 kilometres, could be a major repository for water, 
owing to the ability of the higher-pressure polymorphs of olivine— 
wadsleyite and ringwoodite—to host enough water to comprise up 
to around 2.5 per cent of their weight. A hydrous transition zone 
may have a key role in terrestrial magmatism and plate tectonics*”’, 
yet despite experimental demonstration of the water-bearing capa- 
city of these phases, geophysical probes such as electrical conduc- 
tivity have provided conflicting results* '°, and the issue of whether 
the transition zone contains abundant water remains highly con- 
troversial''. Here we report X-ray diffraction, Raman and infrared 
spectroscopic data that provide, to our knowledge, the first evidence 
for the terrestrial occurrence of any higher-pressure polymorph of 
olivine: we find ringwoodite included in a diamond from Juina, Brazil. 
The water-rich nature of this inclusion, indicated by infrared absorp- 
tion, along with the preservation of the ringwoodite, is direct evid- 
ence that, at least locally, the transition zone is hydrous, to about 1 
weight per cent. The finding also indicates that some kimberlites 
must have their primary sources in this deep mantle region. 

Samples of mantle-derived peridotites show that olivine (Mg2SiO4)
is the dominant phase in the Earth’s shallow upper mantle, to a depth
of ~400 km (ref. 12). At greater depths, between approximately 410
and 660 km, within the transition zone, the high-pressure olivine poly- 
morphs wadsleyite and ringwoodite are thought to dominate mantle 
mineralogy owing to the fit of seismic discontinuity data to predictions 
from phase equilibria’*’*. No unretrogressed samples of any high- 
pressure olivine polymorph have been sampled from the mantle, and, 
hence, this inference is highly likely, but is unconfirmed by sampling. 
Sampling the transition zone is important because it is thought to be 
the main region of water storage in the solid Earth, sandwiched bet- 
ween relatively anhydrous shallow upper mantle and lower mantle*’. 
The potential presence of significant water in this part of the Earth has 
been invoked to explain key aspects of global volcanism® and has sig- 
nificant implications for the physical properties and rheology of the 
transition zone*’""*. Finding confirmatory evidence of the presence of 
ringwoodite in Earth’s mantle, and determining its water content, is an 
important step in understanding deep Earth processes. 

The discovery of ultradeep diamonds, originating below the lith- 
ospheric mantle’**’, allows a unique window into the material con- 
stituting the Earth’s transition zone. As such, these diamonds should 
provide the best opportunity for finding both wadsleyite and ring- 
woodite. Moreover, several studies have reported olivine that may have 
originated as a higher-pressure polymorph**”. 

In this study, we focused on diamonds from the Juina district of Mato 
Grosso, Brazil, in a search for ultrahigh-pressure inclusions. Alluvial
deposits centred on tributaries east of the Rio Aripuana, Juina district,
contain abundant diamonds that originate in the Earth’s transition
zone and lower mantle.

Diamond JUc29 is a 0.09 g, colourless/light-brown, irregular crystal
(Extended Data Fig. 1) from deposits of the Rio Vinte e Um de Abril,
downstream from kimberlite pipe Aripuana-01. It exhibits a high degree
of surface resorption, is moderately plastically deformed and its
nitrogen content is below detection by infrared spectrometry; that is,
the diamond is type IIa. These are all characteristics of most ultradeep
diamonds from Juina. A crystal of greenish appearance and ~40 µm
in its maximum dimension was located optically in the diamond (Extended
Data Fig. 1). Synchrotron X-ray tomography shows the inclusion to
form part of a pair, with a Ca-rich and a Fe-bearing phase immediately
adjacent (Extended Data Fig. 2). Single-crystal X-ray diffraction of the
Fe-bearing phase revealed the main four diffraction peaks of ringwoodite,
in their relative order of expected intensity, that is, in descending
order of intensity, the (113) plane at 2.44 Å, the (440) plane at 1.40 Å,
the (220) plane at 2.81 Å and the (115) plane at 1.51 Å (Extended Data
Fig. 3). The expected fifth peak at about 2.02 Å was not found, being
covered by the very intense diamond peak, which occurs at the same d
spacing (the single distance between two atomic lattice planes belonging
to a family of infinite lattice planes all equidistant and parallel). The
positions of these peaks (that is, the d spacing) and, in particular, the
precisely measured relative order of intensities, detected by
charge-coupled device (CCD), confirm the identity of the inclusion as
ringwoodite but do not allow an accurate compositional estimate.
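
The reported d spacings can be related to diffraction angles through Bragg's law, λ = 2d sin θ. The following minimal sketch converts each spacing to a first-order 2θ value, assuming the Mo radiation wavelength of 0.71073 Å quoted in the Methods Summary. The function and constant names are illustrative and are not part of the authors' workflow.

```python
import math

# Wavelength in angstroms, as quoted in the Methods Summary.
MO_KALPHA_ANGSTROM = 0.71073

# d spacings (angstroms) reported for the inclusion, keyed by Miller indices.
REPORTED_D = {
    (1, 1, 3): 2.44,
    (4, 4, 0): 1.40,
    (2, 2, 0): 2.81,
    (1, 1, 5): 1.51,
}


def two_theta_deg(d, wavelength=MO_KALPHA_ANGSTROM):
    """First-order Bragg angle 2-theta (degrees) for lattice spacing d."""
    s = wavelength / (2.0 * d)
    if not 0.0 < s <= 1.0:
        raise ValueError("no first-order reflection for this d spacing")
    return 2.0 * math.degrees(math.asin(s))


def d_spacing(two_theta, wavelength=MO_KALPHA_ANGSTROM):
    """Inverse relation: lattice spacing from a 2-theta value in degrees."""
    return wavelength / (2.0 * math.sin(math.radians(two_theta / 2.0)))


if __name__ == "__main__":
    for hkl, d in REPORTED_D.items():
        print(hkl, f"d = {d:.2f} A, 2theta = {two_theta_deg(d):.2f} deg")
```

Smaller d spacings map to larger angles, so the (440) and (115) reflections appear at higher 2θ than (113) and (220).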

Figure 1 | Raman spectra of ringwoodite and walstromite inclusions in
Juina diamond JUc29. Raman spectra (unsmoothed, background-subtracted,
in relative intensity units, stacked for clarity, shown in grey) for
two-phase inclusion within JUc29 diamond, Juina. Spectra are complex,
displaying SiO4 stretching modes for ringwoodite ([Mg,Fe]2SiO4) that are
broadened, probably by disordering induced by incipient retrogression, as
well as the characteristic modes for Ca-walstromite (CaSiO3). Reference
spectra for olivine (blue), ringwoodite (red) and CaSiO3-walstromite
(green) are from refs 20, 33.

Micro-Raman spectra of the inclusion (Fig. 1, grey traces) allowed
ringwoodite to be identified by the two intense Raman bands that form
a doublet corresponding to the asymmetric (T2g) and symmetric (A1g)
stretching vibrations of SiO4 tetrahedra and which occur in the spectral
regions ~807 and 860 cm⁻¹, respectively. We refer to these bands as
DB1 and DB2, respectively. The spacing of these two bands is 30%
wider than that in olivine, and DB1 is displaced to significantly
lower wavenumbers. Band DB1 in JUc29 is defined from peak
fitting to be located between 807 and 809 cm⁻¹ with DB2 between 854
and 860 cm⁻¹. The increase in wavenumber of both DB1 and DB2 relative
to the reference spectrum in Fig. 1 (red trace) and other synthetic
ringwoodites is due largely to the influence of the compressive stress
developed around the inclusion. This stress results from the difference
in the volume expansion of the inclusion relative to the diamond that
has helped to preserve the ringwoodite. All JUc29 Raman spectra show
significant broadening of these SiO4 stretching vibrations. This
broadening is probably due to increased disordering resulting from a
tendency for ringwoodite to revert to olivine at lower pressure, and hampers
the use of the doublet band separation in estimating the composition of
the ringwoodite. Nevertheless, an estimate of the composition can be
attempted, on the basis of the shift in DB1 in response to pressure and
increasing Fe in the structure, which have opposite effects (see Methods
section on Raman spectroscopy). The compressive stress imposed on
the inclusion was estimated by measuring the Raman shift of the main
diamond band in the immediately adjacent diamond (1,337 cm⁻¹),
which yields internal pressures of between 1.7 and 2.3 GPa depending
on the pressure calibration of the Raman shift used (see Methods as
above). Our estimate for the resulting phase composition yields a Mg
number, Mg# = 100Mg/(Mg+Fe), of ~75, where the uncertainty is
dominated by the uncertainty in the confining pressure, the exact
position of DB1 and the calibration of DB1’s position with composition
(see Methods as above). Although the compositional uncertainty is large,
the presence of significant Fe in the structure is consistent with the
confocal X-ray fluorescence data (Extended Data Fig. 2).
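
Two of the quantities in this paragraph reduce to simple arithmetic, sketched below. The Mg# definition is taken directly from the text. The pressure estimate assumes a linear calibration of the diamond Raman band; the unstressed band position (about 1,332 cm⁻¹) and the two slopes (2.2 and 2.9 cm⁻¹ per GPa) are illustrative assumptions of this example, not values taken from the paper's Methods.

```python
def mg_number(mg, fe):
    """Mg# = 100 * Mg / (Mg + Fe), with Mg and Fe in molar (atomic) units."""
    if mg + fe <= 0:
        raise ValueError("Mg + Fe must be positive")
    return 100.0 * mg / (mg + fe)


def pressure_from_diamond_shift(observed_cm1,
                                ambient_cm1=1332.0,       # assumed unstressed position
                                slope_cm1_per_gpa=2.9):   # assumed linear calibration
    """Pressure (GPa) implied by the upshift of the main diamond Raman band."""
    return (observed_cm1 - ambient_cm1) / slope_cm1_per_gpa


if __name__ == "__main__":
    # A hypothetical (Mg1.5Fe0.5)SiO4 composition corresponds to Mg# 75.
    print(mg_number(1.5, 0.5))

    # Band position from the text, evaluated with two assumed calibrations.
    for slope in (2.2, 2.9):
        p = pressure_from_diamond_shift(1337.0, slope_cm1_per_gpa=slope)
        print(f"slope {slope} cm^-1/GPa -> {p:.2f} GPa")
```

With these assumed slopes the 5 cm⁻¹ upshift gives pressures inside the 1.7 to 2.3 GPa range quoted above, which illustrates why the result depends on the calibration chosen.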

Additional Raman-active bands at 662, 990 and 1,050 cm⁻¹ are
present in the JUc29 spectra and can be attributed to the presence of 
CaSiO3-walstromite (Fig. 1) adjacent to ringwoodite. Spectrum JUc29v 
sampled only the Ca-rich phase and is spectrally very similar to refer- 
ence CaSiO3-walstromite (Fig. 1, green trace). 
Fourier transform infrared (FTIR) spectra for the inclusion reveal a
pronounced OH⁻ stretching vibration with the band centre between
3,150 and 3,200 cm⁻¹ (Fig. 2). The broad band at 3,150 cm⁻¹ and that
at 3,680 cm⁻¹ correspond to OH⁻ stretching modes reported in synthetic
hydrous ringwoodite. The correspondence between the general form of the
JUc29 FTIR spectra and that of synthetic hydrous ringwoodite, together
with the location of the main OH⁻ stretching band at considerably lower
wavenumber than either hydrous olivine or wadsleyite, strongly support
the identification of our inclusion as not only ringwoodite, but
ringwoodite containing significant water. The location of the main OH⁻
band at between 3,160 and 3,180 cm⁻¹ seems to support a composition
between Mg# 60 and Mg# 100 (see Methods section on FTIR spectroscopy),
and is hence consistent with the Raman estimate.

The phase assemblage presented by the inclusion pair can be used to 
constrain their likely depth of origin. Two scenarios are possible, indi- 
cative of different depths of mantle sampling. Ca-walstromite is stable, 
along a mantle geotherm, at or below 10 GPa (refs 20, 26), where ring- 
woodite with Mg# ~75 must coexist with olivine in a two-phase loop’. 
Although the peak broadening of the main doublet in some of the ring- 
woodite Raman spectra (for example JUc29v; Fig. 1) indicates the pos- 
sibility of partial retrogression to olivine in parts of the crystal, there is 
no indication of a highly crystalline olivine phase from the X-ray mea- 
surement. Hence, we interpret this phenomenon as disorder induced 
during the incipient breakdown of ringwoodite to olivine. Given this, 
the most likely interpretation of this two-phase assemblage is that it 
represents a partly retrogressed portion of a somewhat Fe-rich
peridotitic mantle, in which hydrous ringwoodite and former
CaSiO3-perovskite coexisted above 15 GPa, that is, in the transition zone,
probably with majorite garnet. The ringwoodite has largely avoided 
retrogression, whereas the CaSiO3-perovskite precursor reverted to Ca- 
walstromite. The slightly more Fe-rich composition of the ringwoodite 
may arise by reaction between the peridotitic and basaltic portions of a 
subducted slab** and may not be indicative of the bulk of the transition 
zone because of the resulting broadening of the 410-km seismic dis- 
continuity that would be seen at such Fe-rich compositions”. 

It is important to constrain the amount of water in the ringwoodite
inclusion because this has implications for the water content of the
transition zone. From experiments, ringwoodite may incorporate up
to 2.5 wt% H2O under transition-zone conditions. The difficulties in
constraining sample thickness during FTIR measurement, especially in
determining whether the beam was sampling part of the Ca-walstromite
inclusion, plus any spectral absorption by the rather impure diamond
host, make the estimation of the ringwoodite water content subject to
large uncertainty. The main OH stretching band at ~3,150 cm⁻¹ in hydrous
ringwoodite becomes more pronounced with increasing H2O content, up to
~0.8 wt% H2O (ref. 4; Fig. 2). The JUc29 spectra show strong OH
absorption, clearly indicative of significant H2O content, and are
consistent with a minimum estimate between 1.4 and 1.5 wt% H2O, derived
by integrating the spectra in Fig. 2 (see Methods section on FTIR
spectroscopy). Although the uncertainty in these estimates may be as
large as 50%, we note that in synthetic ringwoodites containing 2 wt%
H2O or more, the satellite OH⁻ stretching mode at 3,645-3,680 cm⁻¹
transforms from a broad shoulder to a sharply defined vibrational band.
This stretching mode is well defined in the JUc29 inclusion, supporting
our calculated water concentration as a minimum estimate.

Two main scenarios arise from the water-rich nature of the ringwoodite
inclusion coming from transition-zone depths. In one, water
within the ringwoodite reflects inheritance from a hydrous,
diamond-forming fluid, from which the inclusion grew as a syngenetic phase.
In this model, the hydrous fluid must originate locally, from the
transition zone, because there is no evidence that the lower mantle contains
a significant amount of water. Alternatively, the ringwoodite is
‘protogenetic’, that is, it was present before encapsulation by the diamond
and its water content reflects that of the ambient transition zone. Both
models implicate a transition zone that is at least locally water-rich. It
is interesting to explore the protogenetic option further to see what
bounds would be placed on the bulk transition-zone water content in
the light of geophysical observations.

Figure 2 | FTIR spectra of ringwoodite inclusion in Juina diamond JUc29.
a, Unpolarized FTIR spectra for ringwoodite inclusion in diamond JUc29
between 2,200 and 3,900 cm⁻¹. The two spectra were measured at ~90°
to each other and are unsmoothed, but were corrected for a background
that includes the intrinsic response of the host diamond. Water contents
calculated by integration of these two spectra are between 1.4 and 1.5 wt%
(see Methods section on FTIR). b, Reference spectra for hydrous Fe-bearing
ringwoodite (Mg# 89) containing ~1 wt% H2O (ref. 27).

A conservative estimate of the H2O content of JUc29 ringwoodite,
1.4 wt%, combined with mineral mode estimates and water
solubilities for majorite and Ca-perovskite, gives a bulk water content
of ~1.0 wt% for the transition zone sampled by our diamond. This
value is broadly aligned with the highest transition-zone water contents
estimated from electromagnetic data. Other studies of ultradeep
diamonds have indicated the transition zone could contain stagnated
subducted slabs that may transport water to this mantle region. The
presence of hydrous ringwoodite in a diamond from transition-zone
depths supports the view that high fluid activity, notably that of water,
has a key role in the genesis of ultradeep diamonds and is consistent
with the proposal of regionally localized ‘wet-spots’ in the transition
zone that may host thin melt layers above the 410-km discontinuity.
Our observations provide clear support for experimental measurements
showing that the P- and S-wave velocities of the lower transition
zone are consistent with a hydrated ringwoodite-rich composition.
The preservation of ringwoodite within diamond also provides a strong
indication that some kimberlites must come from at least
transition-zone depths.
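
The bulk estimate described above is, in essence, a mode-weighted sum of the water held in each phase. The sketch below shows that arithmetic only. Apart from the 1.4 wt% ringwoodite value taken from the text, every number is a hypothetical placeholder chosen for illustration (the modes are treated as mass fractions), and none should be read as the mineral modes or solubilities the authors used.

```python
def bulk_water_wt_percent(modes, water_wt_percent):
    """Mode-weighted bulk water content (wt%).

    modes: phase -> mass fraction (must sum to 1)
    water_wt_percent: phase -> water content of that phase in wt%
    """
    total = sum(modes.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("phase fractions must sum to 1")
    return sum(frac * water_wt_percent[phase] for phase, frac in modes.items())


# Hypothetical phase proportions, for illustration only.
EXAMPLE_MODES = {"ringwoodite": 0.6, "majorite": 0.3, "Ca-perovskite": 0.1}

# Ringwoodite value from the text; the other two are hypothetical placeholders.
EXAMPLE_WATER = {"ringwoodite": 1.4, "majorite": 0.5, "Ca-perovskite": 0.0}

if __name__ == "__main__":
    bulk = bulk_water_wt_percent(EXAMPLE_MODES, EXAMPLE_WATER)
    print(f"bulk water content: {bulk:.2f} wt%")
```

Because ringwoodite dominates both the mode and the water budget in such a sum, the bulk value is controlled mainly by the ringwoodite water content and its large uncertainty.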


METHODS SUMMARY 


Full descriptions of all analytical methods and calculations of the
compositional estimates are provided in Methods. Micro-X-ray fluorescence
measurements were performed at beamline L of the DORIS-III synchrotron
facility at HASYLAB (DESY, Germany). Measurements were made using confocal
detection of an internal microscopic volume element of approximately
22 µm × 22 µm × 16 µm (full-width at half-maximum). Single-crystal X-ray
diffraction was performed at the Dipartimento di Geoscienze, Università di
Padova, Italy, using a CCD detector coupled to a STOE STADI IV
single-crystal diffractometer, via monochromatized Mo Kα radiation
(λ = 0.71073 Å), working at 50 kV and 40 mA and with an exposure time
of 60 s. We obtained the main four diffraction peaks of ringwoodite (RINGW:
Extended Data Fig. 3), that is, the planes (113) at 2.44 Å, (440) at
1.40 Å, (220) at 2.81 Å and (115) at 1.51 Å, in the expected order of
relative intensity. Raman spectroscopy was carried out at the Geoscience
Institute, Goethe University, Germany, using a Renishaw micro-Raman
spectrometer (RM-1000) equipped with a Leica DMLM optical microscope and
CCD detector. Spectra were excited with the He-Ne 632.8-nm line
(max 50 mW). The wavenumber accuracy was 0.5 cm⁻¹ and the
spectral resolution was ~1 cm⁻¹. The lateral resolution at the sampling
depth was several micrometres and the depth resolution was several tens of
micrometres. Details of the calculation of the ringwoodite composition
from the Raman spectra are given in Methods. FTIR spectra were obtained
with a Nicolet Continuum infrared microscope attached to a Thermo Nicolet
Nexus 470 FTIR Spectrometer at the De Beers Laboratory of Diamond
Research, University of Alberta, Canada. All measurements were performed
in transmitted mode, with an unpolarized beam of aperture size 70 µm. Two
hundred scans were acquired with a spectral resolution of 4 cm⁻¹. Details
of the calculation of water content from the spectra are given
in Methods.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 


Received 16 September 2013; accepted 21 January 2014. 


1. Smyth, J. R. B-Mg2SiOx: a potential host for water in the mantle? Am. Mineral. 72, 
1051-1055 (1987). 

2. Chen, J., Inoue, T., Yurimoto, H. & Weidner, D. J. Effect of water on olivine- wadsleyite 
phase boundary in the (Mg,Fe)2SiO,4 system. J. Geophys. Res. Lett. 29, 1875 
(2002). 

3. Kohlstedt, D. L., Keppler, H. & Rubie, D. C. Solubility of water in the α, β and γ phases 
of (Mg,Fe)2SiO4. Contrib. Mineral. Petrol. 123, 345-357 (1996). 

4. Smyth, J. R. et al. Structural systematics of hydrous ringwoodite and water in 
Earth’s interior. Am. Mineral. 88, 1402-1407 (2003). 

5. Bercovici, D. & Karato, S. Whole-mantle convection and the transition zone water 
filter. Nature 425, 39-44 (2003). 

6. Bolfan-Casanova, N. Water in the Earth’s mantle. Mineral. Mag. 69, 229-258 
(2005). 

7. Hirschmann, M. Water, melting and the deep Earth H2O cycle. Annu. Rev. Earth 
Planet. Sci. 34, 629-653 (2006). 

8. Huang, X., Xu, Y. & Karato, S. Water content in the transition zone from 
electrical conductivity of wadsleyite and ringwoodite. Nature 434, 746-749 
(2005). 

9. Xu, Y., Shankland, T. J. & Rubie, D. C. Electrical conductivity of olivine, wadsleyite 
and ringwoodite under upper-mantle conditions. Science 280, 1415-1418 
(1998). 

10. Yoshino, T., Manthilake, G., Matsuzaki, T. & Katsura, T. Dry mantle transition zone 
inferred from the conductivity of wadsleyite and ringwoodite. Nature 451, 
326-329 (2008). 

11. Khan, A. & Shankland, T. J. A geological perspective on mantle water content and 
melting: inverting electromagnetic sounding data using laboratory-based 
electrical conductivity profiles. Earth Planet. Sci. Lett. 317-318, 27-43 (2012). 

12. Agee, C. B. in Ultrahigh-Pressure Mineralogy (ed. Hemley, R. J.) 165-203 
(Rev. Mineral. 37, Mineralogical Society of America, 1998). 

13. Ringwood, A. E. & Major, A. The system Mg2SiO4-Fe2SiO4 at high pressures and 
temperatures. Phys. Earth Planet. Inter. 3, 89-108 (1970). 

14. Ye, Y. et al. Compressibility and thermal expansion of hydrous ringwoodite with 
2.5 wt% H2O. Am. Mineral. 97, 573-582 (2012). 

15. Harris, J. W., Hutchison, M. T., Hursthouse, M., Light, M. & Harte, B. A new tetragonal 
silicate mineral occurring as inclusions in lower mantle diamonds. Nature 387, 
486-488 (1997). 

16. Harte, B. & Harris, J. W. Lower mantle associations preserved in diamonds. 
Mineral. Mag. 58A, 384-385 (1994). 

17. Harte, B., Harris, J. W., Hutchison, M. T., Watt, G. R. & Wilding, M. C. in Mantle 
Petrology: Field Observations and High Pressure Experimentation Vol. 6 (eds Fei, Y. & 
Bertka, C. M.) 125-153 (Geochem. Soc. Spec. Publ., The Geochemical Society, 
1999). 




18. Hutchison, M. T., Cartigny, P. & Harris, J. W. in Proc. 7th Int. Kimberlite Conf. (eds 
Gurney, J. J., Gurney, J. L., Pascoe, M. D. & Richardson, S. H.) 372-382 (Red Roof 
Design, 1999). 

19. Hutchison, M. T., Hursthouse, M. B. & Light, M. E. Mineral inclusions in diamonds: 
associations and chemical distinctions around the 670-km discontinuity. Contrib. 
Mineral. Petrol. 142, 119-126 (2001). 

20. Brenker, F. et al. Detection of a Ca-rich lithology in the deep (>300 km) convecting 
mantle. Earth Planet. Sci. Lett. 236, 579-587 (2005). 

21. Stachel, T., Brey, G. P. & Harris, J. W. Inclusions in sublithospheric diamonds: 
glimpses of deep Earth. Elements 1, 73-78 (2005). 

22. Stachel, T., Harris, J. W., Brey, G. P. & Joswig, W. Kankan diamonds (Guinea) II: lower 
mantle inclusion paragenesis. Contrib. Mineral. Petrol. 140, 16-27 (2000). 

23. Bulanova, G. et al. Mineral inclusions in sublithospheric diamonds from Collier 4 
kimberlite pipe, Juina, Brazil: subducted protoliths, carbonated melts and 
primary kimberlite magmatism. Contrib. Mineral. Petrol. 160, 489-510 (2010). 

24. Hayman, P. C., Kopylova, M. G. & Kaminsky, F. V. Lower mantle diamonds from 
Rio Soriso (Juina area, Mato Grosso, Brazil). Contrib. Mineral. Petrol. 149, 430-445 
(2005). 

25. Tappert, R., Stachel, T., Harris, J. W., Shimizu, N. & Brey, G. P. Mineral inclusions in 
diamonds from the Panda kimberlite, Slave Province, Canada. Eur. J. Mineral. 17, 
423-440 (2005). 

26. Harte, B. Diamond formation in the deep mantle: the record of mineral inclusions 
and their distribution in relation to mantle dehydration zones. Mineral. Mag. 74, 
189-215 (2010). 

27. Jacobsen, S. D., Smyth, J. R., Spetzler, H., Holl, C. M. & Frost, D. J. Sound velocities 
and elastic constants of iron-bearing hydrous ringwoodite. Phys. Earth Planet. Inter. 
143-144, 47-56 (2004). 

28. Blanchard, M., Balan, E. & Wright, K. Incorporation of water in iron-free 
ringwoodite: a first principles study. Am. Mineral. 94, 83-89 (2009). 

29. Bolfan-Casanova, N., Keppler, H. & Rubie, D. C. Water partitioning between 
nominally anhydrous minerals in the MgO-SiO2-H2O system up to 24 GPa: 
implications for the distribution of water in the Earth’s mantle. 
Earth Planet. Sci. Lett. 182, 209-221 (2000). 




30. Keppler, H. & Bolfan-Casanova, N. in Water in Nominally Anhydrous Minerals 
(eds Keppler, H. & Smyth, J. R.) 193-230 (Rev. Mineral. 62, Mineralogical Society of 
America, 2006). 

31. Kleppe, A. K., Jephcoat, A. P. & Smyth, J. R. Raman spectroscopic study of 
hydrous γ-Mg2SiO4 to 56.5 GPa. Phys. Chem. Miner. 29, 473-476 
(2002). 

32. Kelbert, A., Schultz, A. & Egbert, G. Global electromagnetic induction 
constraints on transition-zone water content variations. Nature 460, 1003-1006 
(2009). 

33. Chen, M., El Goresy, A. & Gillet, P. Ringwoodite lamellae in olivine: clues to olivine- 
ringwoodite phase transition mechanisms in shocked meteorites and subducting 
slabs. Proc. Natl Acad. Sci. USA 101, 15033-15037 (2004). 


Acknowledgements D.G.P. acknowledges CERC funding for this study. F.N. is 
supported by ERC Starting Grant 307322. Support from the Alfred P. Sloan 
Foundation’s Deep Carbon Observatory project created this research partnership. We 
thank T. Stachel for comments on the manuscript plus access to the FTIR instrument at 
the De Beers Laboratory of Diamond Research at the University of Alberta, and we 
thank J. Harris for discussions. Sample JUc29 was provided by Trigon GeoServices Ltd. 


Author Contributions D.G.P. had the idea for the study, wrote the manuscript and 
helped perform the Raman and FTIR measurements. F.E.B. performed the Raman 
measurements and ion-milling and made compositional estimates. F.N. performed 
X-ray measurements. J.M. and L.N. first identified the inclusion as ringwoodite. M.T.H. 
selected the diamond for this study and assisted with manuscript preparation and 
geological background. S.M. performed the FTIR measurements and the water content 
estimate. K.M. assisted with manuscript preparation. G.S., S.S., B.V. and L.V. performed 
the synchrotron X-ray mapping measurements. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to D.G.P. (gdpearso@ualberta.ca). 




doi:10.1038/nature12960 


Derived immune and ancestral pigmentation alleles 
in a 7,000-year-old Mesolithic European 


Iñigo Olalde, Morten E. Allentoft, Federico Sanchez-Quinto, Gabriel Santpere, Charleston W. K. Chiang, 
Michael DeGiorgio, Javier Prado-Martinez, Juan Antonio Rodriguez, Simon Rasmussen, Javier Quilez, Oscar Ramirez, 
Urko M. Marigorta, Marcos Fernandez-Callejo, Maria Encina Prada, Julio Manuel Vidal Encinas, Rasmus Nielsen, 
Mihai G. Netea, John Novembre, Richard A. Sturm, Pardis Sabeti, Tomas Marqués-Bonet, Arcadi Navarro, 
Eske Willerslev & Carles Lalueza-Fox 


Ancient genomic sequences have started to reveal the origin and the 
demographic impact of farmers from the Neolithic period spreading 
into Europe. The adoption of farming, stock breeding and 
sedentary societies during the Neolithic may have resulted in adaptive 
changes in genes associated with immunity and diet. However, 
the limited data available from earlier hunter-gatherers preclude 
an understanding of the selective processes associated with this crucial 
transition to agriculture in recent human evolution. Here we 
sequence an approximately 7,000-year-old Mesolithic skeleton discovered 
at the La Braña-Arintero site in León, Spain, to retrieve a 
complete pre-agricultural European human genome. Analysis of 
this genome in the context of other ancient samples suggests the 
existence of a common ancient genomic signature across western 
and central Eurasia from the Upper Palaeolithic to the Mesolithic. 
The La Braña individual carries ancestral alleles in several skin 
pigmentation genes, suggesting that the light skin of modern Europeans 
was not yet ubiquitous in Mesolithic times. Moreover, we provide 
evidence that a significant number of derived, putatively adaptive 
variants associated with pathogen resistance in modern Europeans 
were already present in this hunter-gatherer. 

Next-generation sequencing (NGS) technologies are revolutionizing 
the field of ancient DNA (aDNA), and have enabled the sequencing 
of complete ancient genomes, such as that of Ötzi, a Neolithic 
human body found in the Alps. However, very little is known of the 
genetic composition of earlier hunter-gatherer populations from the 
Mesolithic period (circa 10,000-5,000 years before present, BP; imme- 
diately preceding the Neolithic period). 

The Iberian site called La Braña-Arintero was discovered in 2006 
when two male skeletons (named La Braña 1 and 2) were found in a 
deep cave system, 1,500 m above sea level in the Cantabrian mountain 
range (León, Northwestern Spain) (Fig. 1a). The skeletons were dated 
to approximately 7,000 years BP (7,940-7,690 calibrated BP). Because 
of the cold environment and stable thermal conditions in the cave, the 
preservation of these specimens proved to be exceptional (Fig. 1b). We 
identified a tooth from La Braña 1 with high human DNA content (48.4%) 
and sequenced this specimen to a final effective genomic depth-of-coverage 
of 3.40× (Extended Data Fig. 1). 

Figure 1 | Geographic location and genetic affinities of the La Braña 1 
individual. a, Location of the La Braña-Arintero site (Spain). b, The La Braña 1 
skeleton as discovered in 2006. c, PCA based on the average of the Procrustes 
transformations of individual PCAs with La Braña 1 and each of the five 
Neolithic samples. The reference populations are the Finnish HapMap, 
FINHM and POPRES. Population labels follow ref. 12, with the 
addition of FI (Finns) or LFI (late-settlement Finns). Ajv70, Ajv52, Ire8 and 
Gok4 are Scandinavian Neolithic hunter-gatherers and a farmer, respectively. 
Ötzi is the Tyrolean Ice Man. 

We used several tests to assess the authenticity of the genome sequence 
and to determine the amount of potential modern human contamination. 
First, we observed that sequence reads from both the mitochondrial 
DNA (mtDNA) and the nuclear DNA of La Braña 1 showed the typical 
ancient DNA misincorporation patterns that arise from degradation of 
DNA over time (Extended Data Fig. 2a, b). Second, we showed that the 
observed number of human DNA fragments was negatively correlated 
with the fragment length (R² > 0.92), as expected for ancient degraded 
DNA, and that the estimated rate of DNA decay was low and in 
agreement with predicted values (Extended Data Fig. 2c, d). We then 
estimated the contamination rate in the mtDNA genome, assembled to 
a high depth-of-coverage (91×), by checking for positions differing 
from the mtDNA genome (haplogroup U5b2c1) that was previously 
retrieved with a capture method. We obtained an upper contamination 
limit of 1.69% (0.75-2.6%, 95% confidence interval, CI) (Supplementary 
Information). Finally, to generate a direct estimate of nuclear DNA 
contamination, we screened for heterozygous positions (among reads 
with >4× coverage) in known polymorphic sites (Single Nucleotide 
Polymorphism Database (dbSNP) build 137) at uniquely mapped sections 
on the X chromosome (Supplementary Information). We found 
that the proportion of false heterozygous sites was 0.31%. Together 
these results suggest low levels of contamination in the La Braña 1 
sequence data. 
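
The fragment-length test described above can be sketched as a log-linear fit: if DNA strands break at random, the number of surviving fragments falls off roughly exponentially with length, so ln(count) plotted against length should follow a straight line with a negative slope. The counts below are invented for illustration and the function name is hypothetical; this is a generic least-squares sketch, not the authors' pipeline.

```python
import math


def fit_log_linear(lengths, counts):
    """Least-squares fit of ln(count) = intercept + slope * length.

    Returns (slope, intercept, r_squared).
    """
    ys = [math.log(c) for c in counts]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical fragment-length histogram (length in bp, number of reads).
lengths = [40, 50, 60, 70, 80, 90, 100]
counts = [9800, 7300, 5600, 4100, 3200, 2300, 1800]

slope, intercept, r2 = fit_log_linear(lengths, counts)
decay_per_bp = -slope  # decay constant per base pair under this simple model
```

A negative slope with a high R² is the pattern the text describes; the magnitude of the slope gives the per-base decay constant under the random-breakage assumption.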

To investigate the relationship to extant European samples, we conducted 
a principal component analysis (PCA) and found that the 
approximately 7,000-year-old Mesolithic sample was divergent from 
extant European populations (Extended Data Fig. 3a, b), but was placed 
in proximity to northern Europeans (for example, samples from Sweden 
and Finland). Additional PCAs and allele-sharing analyses with 
ancient Scandinavian specimens supported the genetic similarity of 
the La Braña 1 genome to Neolithic hunter-gatherers (Ajv70, Ajv52, 
Ire8) relative to Neolithic farmers (Gok4, Ötzi) (Fig. 1c, Extended Data 
Figs 3c and 4). Thus, this Mesolithic individual from southwestern 
Europe represents a formerly widespread gene pool that seems to be 
partially preserved in some modern-day northern European populations, 
as suggested previously with limited genetic data. We subsequently 
explored the La Braña affinities to an ancient Upper Palaeolithic 
genome from the Mal’ta site near Lake Baikal in Siberia. Outgroup f3 
and D statistics, using different modern reference populations, support 
the conclusion that Mal’ta is significantly closer to La Braña 1 than to Asians or 
modern Europeans (Extended Data Fig. 5 and Supplementary Information). 
These results suggest that despite the vast geographical distance 
and temporal span, La Braña 1 and Mal’ta share common genetic 
ancestry, indicating a genetic continuity in ancient western and central 
Eurasia. This observation matches findings of similar cultural artefacts 
across time and space in Upper Palaeolithic western Eurasia and Siberia, 
particularly the presence of anthropomorphic ‘Venus’ figurines that 
have been recovered from several sites in Europe and Russia, including 
the Mal’ta site. We also compared the genome-wide heterozygosity of 
the La Braña 1 genome to a data set of modern humans with similar 
coverage (3-4×). The overall genomic heterozygosity was 0.042%, 
lower than the values observed in present day Asians (0.046-0.047%), 
Europeans (0.051-0.054%) and Africans (0.066-0.069%) (Extended 
Data Fig. 6a). The effective population size, estimated from heterozygosity 
patterns, suggests a global reduction in population size of approximately 
20% relative to extant Europeans (Supplementary Information). 
Moreover, no evidence of tracts of autozygosity suggestive of inbreeding 
was observed (Extended Data Fig. 6b). 
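
The D statistic used in the affinity tests above is commonly computed from counts of 'ABBA' and 'BABA' site patterns across four genomes arranged as (((P1, P2), P3), Outgroup). A minimal sketch follows, assuming biallelic sites and one sampled allele per genome per site; the data and the function name are hypothetical, and significance testing (for example a block jackknife) is omitted.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four equal-length allele sequences.

    An allele is treated as ancestral ('A') when it matches the outgroup
    and as derived ('B') otherwise.
    """
    n_abba = n_baba = 0
    for a1, a2, a3, out in zip(p1, p2, p3, outgroup):
        pattern = tuple(allele != out for allele in (a1, a2, a3))
        if pattern == (False, True, True):    # ABBA: P2 and P3 share derived
            n_abba += 1
        elif pattern == (True, False, True):  # BABA: P1 and P3 share derived
            n_baba += 1
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (n_abba - n_baba) / total, n_abba, n_baba


# Hypothetical alleles at six sites (0 = outgroup state, 1 = derived).
outgroup = [0, 0, 0, 0, 0, 0]
p1 = [0, 0, 1, 0, 0, 1]
p2 = [1, 1, 0, 1, 0, 1]
p3 = [1, 1, 1, 1, 0, 0]

result = d_statistic(p1, p2, p3, outgroup)
```

Under this convention a positive D means that P3 shares more derived alleles with P2 than with P1, and a value near zero means it is equally related to both.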

To investigate systematically the timing of selection events in the 
recent history of modern Europeans, we compared the La Braña genome 
to modern populations at loci that have been categorized as of 
interest for their role in recent adaptive evolution. With respect to two 
recent well-studied adaptations to changes in diet, we found the ancient 
genome to carry the ancestral allele for lactose intolerance and approximately 
five copies of the salivary amylase (AMY1) gene (Extended Data 
Fig. 7 and Supplementary Information), a copy number compatible 
with a low-starch diet. These results suggest the La Braña hunter-gatherer 
was poor at digesting milk and starch, supporting the hypotheses 
that these abilities were selected for during the later transition to 
agriculture. 

To expand the survey, we analysed a catalogue of candidate signals 
for recent positive selection based on whole-genome sequence variation 
from the 1000 Genomes Project, which included 35 candidate 
non-synonymous variants, ten of which were detected uniquely in the 
CEU (Utah residents with northern and western European ancestry) 
sample. For each variant we assessed whether the Mesolithic genome 
carried the ancestral or derived (putatively adaptive) allele. 
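
The per-variant check described above amounts to comparing a called genotype with the known ancestral and derived alleles at that site. A small sketch, using hypothetical variant names, genotype calls and allele assignments (placeholders, not values from the study):

```python
def classify_allelic_state(genotype, ancestral, derived):
    """Classify a diploid genotype such as 'AG' relative to known alleles."""
    if len(genotype) != 2:
        raise ValueError("expected a two-character diploid genotype")
    alleles = set(genotype.upper())
    if not alleles <= {ancestral, derived}:
        return "unexpected allele"
    if alleles == {derived}:
        return "homozygous derived"
    if alleles == {ancestral}:
        return "homozygous ancestral"
    return "heterozygous"


# Hypothetical calls: name -> (genotype, ancestral allele, derived allele).
calls = {
    "variant_1": ("GG", "A", "G"),
    "variant_2": ("AG", "A", "G"),
    "variant_3": ("CC", "C", "T"),
}

states = {name: classify_allelic_state(*call) for name, call in calls.items()}
```

With low-coverage ancient data a single observed allele cannot rule out heterozygosity, so in practice a depth threshold would normally be applied before a call like this is trusted.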

Of the ten variants, the Mesolithic genome carried the ancestral and 
non-selected allele as a homozygote in three regions: C12orf29 (a gene with 
unknown function), SLC45A2 (rs16891982) and SLC24A5 (rs1426654) 
(Table 1). The latter two variants are the two strongest known loci 
affecting light skin pigmentation in Europeans and their ancestral 
alleles and associated haplotypes are either absent or segregate at very 
low frequencies in extant Europeans (3% and 0% for SLC45A2 and 
SLC24A5, respectively) (Fig. 2). We subsequently examined all genes 
known to be associated with pigmentation in Europeans, and found 
ancestral alleles in MC1R, TYR and KITLG, and derived alleles in 
TYRP1, ASIP and IRF4 (Supplementary Information). Although the 
precise phenotypic effects cannot currently be ascertained in a European 
genetic background, results from functional experiments indicate that 
the allelic combination in this Mesolithic individual is likely to have 
resulted in dark skin pigmentation and dark or brown hair. Further 
examination revealed that this individual carried the HERC2 rs12913832*C 
single nucleotide polymorphism (SNP) and the associated homozygous 
haplotype spanning the HERC2-OCA2 locus that is strongly associated 
with blue eye colour. Moreover, a prediction of eye colour based on 
genotypes at additional loci using HIrisPlex produced a 0.823 maximal 
and 0.672 minimal probability for being non-brown-eyed (Supplementary 
Information). The genotypic combination leading to a predicted 
phenotype of dark skin and non-brown eyes is unique and no longer 
present in contemporary European populations. Our results indicate 
that the adaptive spread of light skin pigmentation alleles was not 
complete in some European populations by the Mesolithic, and that 
the spread of alleles associated with light/blue eye colour may have 
preceded changes in skin pigmentation. 


Table 1 | Mesolithic genome allelic state at 10 nonsynonymous variants recently selected in Europeans 

Allelic state in La Braña 1 | Gene | Name | SNP | Amino-acid change | Function 
--- | --- | --- | --- | --- | --- 
Derived allele | PTX4 | Pentraxin 4 | rs2745098 | Arg281Lys | May be involved in innate immunity 
Derived allele | UHRF1BP1 | UHRF1 binding protein 1 | rs11755393 | Gln454Arg | Risk locus for systemic lupus erythematosus 
Derived allele | GPATCH1 | G patch domain containing 1 | rs10421769 | Leu520Ser | Receptor for OmpA expressed by E. coli 
Derived allele | WWOX | WW domain-containing oxidoreductase | rs12918952 | Ala179Thr | Acts as a tumour suppressor and has a role in apoptosis 
Derived allele | CCDC14 | Coiled-coil domain-containing protein 14 | rs17310144 | Thr365Pro | Unknown 
Both ancestral and derived alleles | SETX | Senataxin | rs1056899 | Val2587Ile | Involved in spinocerebellar ataxia and amyotrophic lateral sclerosis 
Both ancestral and derived alleles | TDRD12 | Tudor domain containing 12 | rs11881633 | Glu413Lys | Unknown 
Ancestral allele | C12orf29 | Chromosome 12 open reading frame 29 | rs9262 | Val238Leu | Unknown 
Ancestral allele | SLC45A2 | Solute carrier family 45, member 2 | rs16891982 | Leu374Phe | Associated with skin pigmentation 
Ancestral allele | SLC24A5 | Solute carrier family 24, member 5 | rs1426654 | Ala111Thr | Associated with skin pigmentation 


Figure 2 | Ancestral variants around the SLC45A2 (rs16891982, above) and 
SLC24A5 (rs1426654, below) pigmentation genes in the Mesolithic genome. 
The SNPs around the two diagnostic variants (red arrows) in these two genes 
were analysed. The resulting haplotype comprises neighbouring SNPs that are 
also absent in modern Europeans (CEU) (n=112) but present in Yorubans 
(YRI) (n=113). This pattern confirms that the La Braña 1 sample is older than 
the positive-selection event in these regions. Blue, ancestral; red, derived. 


For the remaining loci, La Braña 1 displayed the derived, putatively 
adaptive variants in five cases, including three genes, PTX4, UHRF1BP1 
and GPATCH1 (ref. 19), involved in the immune system (Table 1 and 
Extended Data Fig. 8). GPATCH1 is associated with the risk of bacterial 
infection. We subsequently determined the allelic states in 63 SNPs 
from 40 immunity genes with previous evidence for positive selection 
and for carrying polymorphisms shown to influence susceptibility to 
infections in modern Europeans (Supplementary Information). La 
Braña 1 carries derived alleles in 24 genes (60%) that have a wide range 
of functions in the immune system: pattern recognition receptors, 
intracellular adaptor molecules, intracellular modulators, cytokines 
and cytokine receptors, chemokines and chemokine receptors and 
effector molecules. Interestingly, four out of six SNPs from the first 
category are intracellular receptors of viral nucleic acids (TLR3, TLR8, 
IFIH1 (also known as MDA5) and LGP2). 

Finally, to explore the functional regulation of the genome, we also 
assessed the La Braña 1 genotype at all expression quantitative trait loci 
(eQTL) regions associated with positive selection in Europeans (Supplementary 
Information). The most interesting finding is arguably the 
predicted overexpression of eight immunity genes (36% of those with 
described eQTLs), including three Toll-like receptor genes (TLR1, TLR2 
and TLR4) involved in pathogen recognition. 

These observations suggest that the Neolithic transition did not drive 
all cases of adaptive innovation on immunity genes found in modern 
Europeans. Several of the derived haplotypes seen at high frequency 
today in extant Europeans were already present during the Mesolithic, 
as neutral standing variation or due to selection predating the Neolithic. 
De novo mutations that increased in frequency rapidly in response to 
zoonotic infections during the transition to farming should be identified 
among those genes where La Braña 1 carries ancestral alleles. 

To confirm whether the genomic traits seen at La Braña 1 can be 
generalized to other Mesolithic populations, analyses of additional ancient 
genomes from central and northern Europe will be needed. Nevertheless, 
this genome sequence provides the first insight as to how these hunter- 
gatherers are related to contemporary Europeans and other ancient 
peoples in both Europe and Asia, and shows how ancient DNA can shed 
light on the timing and nature of recent positive selection. 


METHODS SUMMARY 


DNA was extracted from the La Braña 1 tooth specimen with a previously 
published protocol. Indexed libraries were built from the ancient extract and sequenced 
on the Illumina HiSeq platform. Reads generated were mapped with BWA to the 
human reference genome (NCBI 37, hg19) after primer trimming. A metagenomic 
analysis and taxonomic identification were generated with the remaining reads 
using BLAST 2.2.27+ and MEGAN4 (ref. 28) (Extended Data Fig. 9). SNP calling 
was undertaken using a specific bioinformatic pipeline designed to account for 
ancient DNA errors. Specifically, the quality of misincorporations likely caused by 
ancient DNA damage was rescaled using the mapDamage2.0 software, and a set 
of variants with a minimum read depth of 4 was produced with GATK. Analyses 
including PCA, Outgroup f3 and D statistics were performed to determine the 
population affinities of this Mesolithic individual (Supplementary Information). 
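
The minimum-read-depth filter mentioned above can be illustrated on VCF-style text. The sketch assumes that depth is reported as a `DP=` entry in the INFO column (the eighth tab-separated field), which is the usual VCF convention; the records and function names are invented, and this stands in for, rather than reproduces, the GATK-based pipeline.

```python
MIN_DEPTH = 4  # minimum read depth stated in the text


def info_depth(vcf_line):
    """Return the DP value from the INFO column of a VCF record, or None."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return None
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        if key == "DP" and value.isdigit():
            return int(value)
    return None


def filter_by_depth(vcf_lines, min_depth=MIN_DEPTH):
    """Keep header lines and records whose depth is at least min_depth."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        depth = info_depth(line)
        if depth is not None and depth >= min_depth:
            kept.append(line)
    return kept


# Hypothetical records: depths of 6, 3 and 4 reads respectively.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\tDP=6",
    "1\t2000\t.\tC\tT\t40\tPASS\tDP=3;AF=0.5",
    "1\t3000\t.\tG\tA\t60\tPASS\tAF=1.0;DP=4",
]

kept_records = filter_by_depth(example)
```

Records without a usable depth annotation are dropped here, which is the conservative choice for low-coverage data.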


doublesex is a mimicry supergene 


K. Kunte, W. Zhang, A. Tenger-Trolander, D. H. Palmer, A. Martin, R. D. Reed, S. P. Mullen & M. R. Kronforst 


One of the most striking examples of sexual dimorphism is sex-limited 
mimicry in butterflies, a phenomenon in which one sex, 
usually the female, mimics a toxic model species, whereas the other 
sex displays a different wing pattern. Sex-limited mimicry is 
phylogenetically widespread in the swallowtail butterfly genus Papilio, 
in which it is often associated with female mimetic polymorphism. 
In multiple polymorphic species, the entire wing pattern phenotype 
is controlled by a single Mendelian ‘supergene’. Although 
theoretical work has explored the evolutionary dynamics of supergene 
mimicry, there are almost no empirical data that address 
the critical issue of what a mimicry supergene actually is at a functional 
level. Using an integrative approach combining genetic and 
association mapping, transcriptome and genome sequencing, and 
gene expression analyses, we show that a single gene, doublesex, 
controls supergene mimicry in Papilio polytes. This is in contrast 
to the long-held view that supergenes are likely to be controlled by a 
tightly linked cluster of loci. Analysis of gene expression and DNA 
sequence variation indicates that isoform expression differences 
contribute to the functional differences between dsx mimicry alleles, 
and protein sequence evolution may also have a role. Our results 
combine elements from different hypotheses for the identity of supergenes, 
showing that a single gene can switch the entire wing pattern 
among mimicry phenotypes but may require multiple, tightly linked 
mutations to do so. 

Wing pattern mimicry in butterflies, a phenomenon in which natural 
selection by predators causes unrelated species to evolve similar wing 
patterns, has served as an important model for studying adaptation 
since the earliest days of modern evolutionary theory. Classical Batesian 
mimicry, in which an undefended mimic evolves to look like a toxic 
model, is a parasitic relationship in which the mimic gains an advantage 
at the expense of the model. Such systems have well-characterized 
frequency dependence, sometimes resulting in sexual dimorphism 
and mimetic polymorphism. Swallowtail butterflies in the genus 
Papilio are well-known Batesian mimics, providing some of the most 
extreme examples of sexual dimorphism and polymorphism among 
living organisms. For instance, in the species Papilio polytes, males 
all display the same non-mimetic wing pattern, whereas females display 
either a male-like pattern (form cyrus) or one of several different 
patterns that mimic toxic species in the genus Pachliopta (Fig. 1). 
Female wing pattern is polymorphic in local areas and there are no 
intermediate forms. The early crossing experiments of Clarke and 
Sheppard revealed that variation in the entire wing pattern, as well 
as the presence versus absence of hindwing ‘tails’, is controlled by a 
single Mendelian locus, with female polymorphism resulting from 
multiple alleles, each with its place in a dominance hierarchy. Clarke 
and Sheppard also showed that the mimicry locus is autosomal, so 
sexual dimorphism is not directly mediated by sex linkage in this case. 

This phenomenon, in which the entire wing pattern is controlled by 
a single Mendelian locus, is referred to as ‘supergene’ mimicry. Because 
Clarke and Sheppard occasionally witnessed individuals with putatively 
recombinant wing patterns, they envisioned a supergene as a tightly 
linked cluster of loci, each controlling a distinct subset of the wing 
pattern. However, Clarke and Sheppard found virtually no evidence 
for recombination in P. polytes, although they did recover apparently 
recombinant phenotypes in other species, such as P. memnon. Over 
the past few decades, supergene mimicry has received considerable 
theoretical attention, but there are almost no empirical data that address 
the molecular basis of a supergene. One example from Heliconius 
butterflies, which involves supergene mimicry but not sexual dimorphism, 
suggests that supergenes may be the result of chromosomal inversions 
that lock multiple adjacent genes into a single, non-recombining unit. 
Other evolutionary phenomena that involve supergene-like genetic 
architectures, such as self-incompatibility in plants and segregation 
distortion in Drosophila melanogaster, have also been traced back to 
multiple linked genes. A second possibility is that a master regulator 
could gain control of the distinct networks that pattern various aspects 
of the wing, and hence control the entire phenotype from a single locus. 
Although this single-gene hypothesis has been discussed, there are 
no empirical data to support it. 

Figure 1 | Polymorphic, sex-limited mimicry in Papilio polytes. Non-mimetic 
(form cyrus) females look like males, whereas mimetic female morphs 
(forms polytes, theseus and romulus) mimic distantly related, toxic Pachliopta 
swallowtails. The presence of hindwing tails on males and cyrus females is 
variable among P. polytes populations. Our analyses focused on P. polytes 
alphenor, a group lacking tails on non-mimetic butterflies, and presence versus 
absence of tails segregated perfectly with female wing pattern in our crosses. 

Using a multi-step genetic mapping process that involved rearing nine 
F2 backcross families (Fig. 2a), bulk segregant analysis with restriction-site 
associated DNA (RAD) markers, screening and sequencing bacterial 
artificial chromosome (BAC) clones, and fine-mapping, we mapped 
the mimicry locus in P. polytes back to a 300-kilobase (kb) region of the 
genome that contained five genes (Fig. 2b). We were intrigued to find 
that one of these genes was doublesex (dsx), a transcription factor in 
insects that controls somatic sex differentiation by alternative splicing. 
In Drosophila, dsx is alternatively spliced into two isoforms: a male-specific 
form that leads to male sexual differentiation, and an alternative 
female form that causes female sexual differentiation. In other 
insects, dsx functions the same way although there can be more than 
one male and female isoform. 
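
When a trait is controlled by a single Mendelian locus, backcross offspring are typically expected to fall into two phenotype classes in a 1:1 ratio. One simple way to check such a ratio is a chi-square goodness-of-fit test with one degree of freedom. The sketch below uses invented counts and a hypothetical function name; 3.841 is the standard 5% critical value for one degree of freedom, and nothing here is taken from the authors' analysis.

```python
def chi_square_one_to_one(n_first, n_second):
    """Chi-square statistic for a 1:1 expectation between two classes."""
    total = n_first + n_second
    if total == 0:
        raise ValueError("no observations")
    expected = total / 2.0
    return ((n_first - expected) ** 2 + (n_second - expected) ** 2) / expected


CRITICAL_5_PERCENT_DF1 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05

# Hypothetical counts of mimetic and non-mimetic daughters in one family.
stat = chi_square_one_to_one(27, 23)
consistent_with_1_to_1 = stat < CRITICAL_5_PERCENT_DF1
```

A statistic below the critical value means the observed split gives no evidence against 1:1 segregation at the 5% level.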

Figure 2 | Mapping the mimicry supergene. a, A series of nine backcross 
families yielded a total of 443 F2 females that segregated 1:1 for female mimicry 
phenotype. b, Genome-wide mapping with RAD markers and subsequent 
fine-mapping localized the mimicry locus to a 300-kb interval containing five genes, 
one of which was doublesex (dsx). c, Association mapping, based on full genome 
sequences of 30 P. polytes butterflies, revealed multiple perfect associations 
inside dsx but none outside the gene. The positions of the 300-kb 
zero-recombinant interval and dsx are indicated. Data points represent 
false-discovery rate (FDR)-adjusted P values for a total of 94,776 SNPs. 

On the basis of our mapping data and the known role of dsx in 
mediating sexual dimorphism, we proposed that dsx might control 
both the sex-limited and female polymorphism components of P. polytes 
mimicry. To test this hypothesis, we generated a reference genome 
sequence across our target interval and performed comprehensive 
association mapping by re-sequencing the genomes of 15 mimetic (form 
polytes) and 15 non-mimetic (form cyrus) butterflies (Extended Data 
Table 1). This yielded multiple perfect associations in dsx but only weak 
associations immediately outside of dsx (Fig. 2c). A separate genome-wide 
association study (GWAS) also yielded dsx as the top association 
hit (Extended Data Table 2). Long-term balancing selection, which 
maintains mimicry polymorphism, is expected to result in a localized 
excess of nucleotide variation driven by the accumulation of neutral 
substitutions on alternative alleles. Analysis of DNA sequence 
variation revealed a highly significant excess of nucleotide polymorphism 
in dsx, relative to neighbouring genes (Table 1 and Extended Data 
Table 3), and comparisons between mimetic and non-mimetic individuals 
revealed over 1,000 nucleotide substitutions differentiating mimetic 
and non-mimetic dsx alleles (Table 1). This is in contrast to all neighbouring 
genes, which show little polymorphism and no fixed differences 
between mimicry forms. 

The involvement of dsx as the mimicry supergene indicates a potential 
role for alternative splicing in the control of wing pattern. Transcriptome 
assembly based on wing-disc-derived RNA yielded three distinct female 
dsx isoforms and one male isoform (Fig. 3a). However, cloning and 
sequencing dsx isoforms from mimetic and non-mimetic males and 
females yielded the same repertoire of isoforms in butterflies with alter- 
native mimicry alleles. Comparisons of isoform expression using quan- 
titative reverse transcription PCR (qRT-PCR) revealed that all three 
female isoforms show strong female-biased expression (Fig. 3b-d). Two 
of these, isoforms 1 and 2, further showed pronounced wing-biased 
expression, whereas the third female isoform had body-biased expres- 
sion (Fig. 3b-d). Comparisons between mimetic and non-mimetic 
females for wing-biased isoforms 1 and 2 revealed marked upregula- 
tion in mimetic females relative to non-mimetic females (Fig. 3e, f). 
This biased expression probably contributes to the functional differ- 
ence between mimicry alleles. Notably, expression of isoforms 1 and 2 
seems to increase at day 5 after pupation (Fig. 3g), a stage at which 
immunodetection of Dsx spatial expression on mimetic forewings 
revealed a marked spatial correspondence with adult wing pattern 
(Fig. 3h). 

Overall, our results indicate a surprising mode of action for dsx as a 
mimicry supergene. Because dsx is a classic example of alternative splicing, our 
initial hypothesis was that alternative splicing would also underlie the 
phenotypic switch between female wing patterns. Although we do find 
clear evidence of alternative splicing, and different levels of isoform 
expression between female wing patterns, the set of female isoforms 
does not differ between groups. Rather, gene expression variation seems 
to have a central role in controlling mimicry polymorphism. Another 
striking feature of dsx in P. polytes is the large number of nucleotide 
substitutions that differ between mimicry alleles. The accumulation of 
neutral substitutions that is expected from balancing selection makes it 
difficult to infer which of these changes might be functionally related to 
mimicry polymorphism. However, we note that the proportion of fixed 
differences between cyrus and polytes haplotypes is over seven times 
greater in coding regions (72 out of 1,068 bp) than in 
non-coding regions (972 out of 108,036 bp), and these coding 
region changes include 25 amino acid substitutions located primarily 
in the first exon (Table 1). The amino acid changes in exon 1 are clus- 
tered in two regions: the 5’ end of the protein, in front of the DNA 
binding (DM) domain, and the region between the DM domain and 
the dimerization domain; there are no amino acid changes in either 
domain (Extended Data Fig. 1). To explore the potential impact of these 
amino acid substitutions, we predicted secondary and tertiary struc- 
tures for both the cyrus and polytes Dsx proteins and found that they 
differ markedly: the non-mimetic cyrus protein folds much like those of other 
insects, such as Bombyx mori, whereas the mimetic polytes protein 
structure is highly divergent (Extended Data Fig. 2). In addition to 
the differential expression of female isoforms, we speculate that distinct 
Dsx protein structures may also contribute to female polymorphism, 
with alternative alleles differentially regulating different downstream 
targets as a result of divergent DNA or coactivator binding properties. 
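
The coding versus non-coding comparison above can be checked directly from the counts and sequence lengths reported in Table 1. The following is a minimal Python sketch of that arithmetic; the variable and function names are illustrative and are not part of the original analysis.

```python
# Exon lengths (bp) for dsx as listed in Table 1.
exon_lengths_bp = [588, 144, 84, 69, 183]

# Totals quoted in the text: fixed differences and sequence length (bp).
coding_fixed, coding_bp = 72, sum(exon_lengths_bp)
noncoding_fixed, noncoding_bp = 972, 108_036


def proportion(fixed: int, length_bp: int) -> float:
    """Fixed differences per base pair of sequence."""
    return fixed / length_bp


coding_rate = proportion(coding_fixed, coding_bp)
noncoding_rate = proportion(noncoding_fixed, noncoding_bp)
fold_enrichment = coding_rate / noncoding_rate

print(f"coding:     {coding_rate:.4f} fixed differences per bp")
print(f"non-coding: {noncoding_rate:.4f} fixed differences per bp")
print(f"fold enrichment in coding sequence: {fold_enrichment:.1f}")
```

Under these counts the coding proportion is roughly 7.5 times the non-coding proportion, which agrees with the "over seven times" statement.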

How are a large number of nucleotide substitutions maintained in 
complete linkage disequilibrium over the approximately 100-kb length 
of dsx? Recombination between mimicry alleles in heterozygotes should 
break up dsx haplotypes, and the fact that we see many differences 
Table 1 | DNA sequence variation in Papilio polytes near the mimicry supergene 

Gene    Section       Length (bp)   Fixed synonymous/silent   Fixed non-synonymous   Total SNPs
                                    substitutions             substitutions
neuro   ORF           519           0                         0                      8
clp     ORF           1,443         0                         0                      26
ferm    ORF           2,181         0                         0                      17
rad51   ORF           1,017         0                         0                      2
sir2    ORF           1,224         0                         0                      3
dsx     Exon 1        588           31                        21                     59
dsx     Exon 2        144           5                         0                      8
dsx     Exon 3        84            0                         0                      1
dsx     Exon 4        69            2                         1                      3
dsx     Exon 5        183           9                         3                      13
dsx     Non-coding    108,036       972                       NA                     6,781
pros    ORF           2,895         0                         0                      21


Counts of synonymous/silent and non-synonymous nucleotide substitutions fixed between mimetic (polytes) and non-mimetic (cyrus) P. polytes butterflies in genes located near the mimicry supergene, as well as 
the total number of SNPs in each gene. Counts for dsx are separated by gene section (exons, non-coding) whereas counts for other genes represent predicted open reading frame (ORF). 


between mimicry alleles suggests that something is reducing recom- 
bination immediately around dsx. Chromosomal inversions are well 
known to reduce recombination in heterozygotes, making this a likely 
explanation. We first verified that the dsx region does indeed exhibit 
elevated linkage disequilibrium relative to adjacent regions (Extended 
Data Fig. 3), and then we searched for evidence of structural variation 
around dsx using our genome re-sequencing data. As predicted, we found 
support for an inversion polymorphism associated with mimicry alleles, 
the breakpoints of which flank dsx (Extended Data Table 4 and Extended 
Data Fig. 4). Given the long history of speculation about the molecular 
identity of supergenes, it is interesting that we have uncovered a scenario 
that unites both possible explanations: reduced recombination 
among presumably different functional elements and single gene control. 
In essence, the multiple, tightly linked loci proposed by Clarke and 
Sheppard may, in this case, actually be multiple, tightly linked mutations 
in the same gene. 

It is perhaps unexpected that a gene so intimately connected to an 
essential developmental process could be co-opted to also control intras- 
pecific polymorphism. Somehow, dsx has retained its highly conserved 
sex-differentiation properties while also evolving new phenotype- 
switching properties in just one sex. Our results suggest two comple- 
mentary mechanisms that may underlie the ability of dsx to have two 
distinct roles in P. polytes. First, although we found many mutations in 
the Dsx protein, none of these occurs in the DM or dimerization 
domains, which are essential components for its ancestral function 
in sexual differentiation. Second, we also found that different dsx iso- 
forms are expressed on the wings and in the body of females, which 
may also allow this one gene to carry out a novel function on the wings. 

R.A. Fisher called mimicry the “greatest post-Darwinian application 
of Natural Selection” and supergene mimicry stands out as a particu- 
larly extreme adaptive endpoint. Although little is known about the 
molecular and developmental basis of supergene mimicry, previous 
evidence suggests that multiple, tightly linked genes probably underlie 
this phenomenon. Here we have integrated multiple approaches to reveal 
that a single gene acts as the mimicry supergene in P. polytes. In so 
doing, we have greatly expanded the known role of doublesex and the 
sexual differentiation pathway generally. Female-limited mimetic polymorphism 
has evolved independently multiple times in the genus Papilio, 
making this a useful system in which to investigate the generality of our 
results. One might predict that the sex determination pathway, and dsx 
in particular, may have been co-opted repeatedly to control this phenomenon 
because this pathway is preconfigured to mediate the most 
widespread polymorphism in the animal kingdom: sex. Interestingly, 
available data, although limited, suggest that this is not the case. For 
instance, female mimetic polymorphism in Papilio dardanus has previously 
been mapped to a genomic region containing the genes engrailed 
and invected, which is not linked to dsx. Furthermore, female mimetic 


Figure 3 | Expression of doublesex in P. polytes. a, dsx is alternatively spliced 
into three female isoforms and one male isoform. b-d, Expression of female 
isoforms is strongly female-biased and isoform 1 (b) and isoform 2 (c) show 
wing-biased expression whereas isoform 3 expression (d) is body-biased; n = 6 
(female early), 6 (female mid), 5 (female late), 6 (male early), 6 (male mid), 

7 (male late). e, f, Female isoforms 1 and 2 also show elevated expression in 
mimetic females (polytes) relative to non-mimetic females (cyrus); n = 9 
(polytes early), 9 (polytes mid), 12 (polytes late), 3 (cyrus early), 3 (cyrus mid), 3 
(cyrus late). g, Finer scale temporal data for isoforms 1 and 2 on mimetic female 
wings suggests expression of both increases at 5 days after pupation; n = 3 for 
each time point. h, Immunodetection of Dsx protein on mimetic female wings 
5 days after pupation reveals strong correlation with adult wing pattern. Scale 
bars, 1 mm. Data represented as mean ± s.e.m. All values indicate number of 
biological replicates. *P < 0.05; **P < 0.01, ANOVA and Tukey’s HSD test. 
polymorphism in Papilio glaucus is sex-linked, with the primary switch 
locus on the W chromosome and a modifier on the Z chromosome. 
Future work will determine whether other instances of sex-limited 
polymorphism, in butterflies and beyond, involve the sex differenti- 
ation pathway, but evolution, it seems, can take many paths to the 
extreme supergene genetic architecture, even among members of the 
same genus. 


METHODS SUMMARY 


Using one backcross mapping family (94 females: 48 cyrus and 46 polytes), we 
performed bulk segregant analysis with RAD markers. Subsequent fine mapping, 
using a total of nine mapping families (443 females: 229 cyrus and 214 polytes), and 
BAC sequencing isolated the mimicry locus to a 300-kb interval containing five 
genes, one of which was dsx. We sequenced the genomes of 30 laboratory-reared 
individuals (15 polytes and 15 cyrus) with an Illumina HiSeq 2000 and generated a 
reference genome sequence for P. polytes using both de novo and reference-guided 
assembly. Single nucleotide polymorphism (SNP) calling of the 30 sequenced gen- 
omes yielded 675,526 genome-wide SNPs and 94,776 SNPs across a 4-megabase 
(Mb) scaffold containing dsx. GWAS was performed by calculating genetic differentiation 
(FST) between polytes and cyrus individuals for de novo assembly scaffolds. 
Association tests across the 4-Mb dsx scaffold were performed using a false-discovery 
rate correction. We used Hudson-Kreitman-Aguadé (HKA) tests to compare nuc- 
leotide polymorphism among genes in the mimicry supergene region. Pairwise 
linkage disequilibrium was calculated among biallelic SNPs in two different por- 
tions of the dsx scaffold, and we used the short read sequence data to perform 
structural variant detection. We then used BLAST to identify scaffolds from a de 
novo assembly of polytes samples that appear to span an inversion containing dsx. 
Subsequent PCR tests isolated the 3’ breakpoint to a 2-kb interval. RNA-seq data, 
generated from wing-disc-derived P. polytes RNA, were used to perform transcrip- 
tome assembly and qRT-PCR was used to measure dsx isoform expression in 
males and females across development. We used a protein homology web server 
to infer secondary and tertiary structures of polytes and cyrus Dsx proteins, as well as 
Dsx from Bombyx mori. Immunodetection of Dsx was carried out using a mono- 
clonal anti-Drosophila Dsx DM domain antibody. 
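
The specific false-discovery rate procedure is not named in this summary. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg adjustment and applies it to a handful of made-up P values; the function name and the example inputs are hypothetical and do not come from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up per-SNP association P values (assumption, for illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw P = {raw:.3f} -> FDR-adjusted P = {adj:.5f}")
```

With many thousands of SNPs tested across a scaffold, an adjustment of this kind keeps the expected proportion of false positives among reported associations under control, which is why adjusted rather than raw P values are plotted.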


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 


Received 27 November 2013; accepted 30 January 2014. 
Published online 5 March 2014. 


1. Joron, M. & Mallet, J. L. Diversity in mimicry: paradox or paradigm? Trends Ecol. 
Evol. 13, 461-466 (1998). 

2. Kunte, K. The diversity and evolution of batesian mimicry in Papilio swallowtail 
butterflies. Evolution 63, 2707-2716 (2009). 

3. Kunte, K. Female-limited mimetic polymorphism: a review of theories and a 
critique of sexual selection as balancing selection. Anim. Behav. 78, 1029-1036 
(2009). 

4. Clarke, C. A. & Sheppard, P. M. Super-genes and mimicry. Heredity 14, 175-185 
(1960). 

5. Charlesworth, D. & Charlesworth, B. Theoretical genetics of Batesian mimicry II. 
Evolution of supergenes. J. Theor. Biol. 55, 305-324 (1975). 

6. Charlesworth, D. & Charlesworth, B. Mimicry: the hunting of the supergene. Curr. 
Biol. 21, R846-R848 (2011). 

7. Fisher, R. A. The Genetical Theory of Natural Selection (Clarendon Press, 1930). 

8. Sheppard, P. M. The evolution of mimicry: a problem in ecology and genetics. Cold 
Spring Harb. Symp. Quant. Biol. 24, 131-140 (1959). 

9. Turner, J. R. G. in The Biology of Butterflies (eds Vane-Wright, R. I. & Ackery, P. R.) 
141-161 (Academic, 1984). 

10. Bates, H. W. Contributions to an insect fauna of the Amazon valley (Lepidoptera: 
Heliconidae). Trans. Linn. Soc. (Lond.) 23, 495-566 (1862). 


232 | NATURE | VOL 507 | 13 MARCH 2014 


11. Ford, E. B. The genetics of polymorphism in the Lepidoptera. Adv. Genet. 5, 43-87 
(1953). 

12. Mallet, J. & Joron, M. Evolution of diversity in warning color and mimicry: 
Polymorphisms, shifting balance, and speciation. Annu. Rev. Ecol. Syst. 30, 
201-233 (1999). 

13. Clarke, C. A. & Sheppard, P. M. The genetics of the mimetic butterfly Papilio polytes 
L. Phil. Trans. R. Soc. Lond. B 263, 431-458 (1972). 

14. Clarke, C. A., Sheppard, P. M. & Thornton, I. W. B. The genetics of the mimetic 
butterfly Papilio memnon L. Phil. Trans. R. Soc. Lond. B 254, 37-89 (1968). 

15. Joron, M. et al. Chromosomal rearrangements maintain a polymorphic supergene 
controlling butterfly mimicry. Nature 477, 203-206 (2011). 

16. Larracuente, A. M. & Presgraves, D. C. The selfish Segregation Distorter gene 
complex of Drosophila melanogaster. Genetics 192, 33-53 (2012). 

17. Takayama, S. & Isogai, A. Self-incompatibility in plants. Annu. Rev. Plant Biol. 56, 
467-489 (2005). 

18. Nijhout, H. F. Developmental perspectives on evolution of butterfly mimicry. 
Bioscience 44, 148-157 (1994). 

19. Burtis, K. C. & Baker, B. S. Drosophila doublesex gene controls somatic sexual 
differentiation by producing alternatively spliced mRNAs encoding related 
sex-specific polypeptides. Cell 56, 997-1010 (1989). 

20. Williams, T. M. & Carroll, S. B. Genetic and molecular insights into the development 
and evolution of sexual dimorphism. Nature Rev. Genet. 10, 797-804 (2009). 

21. Kopp, A. Dmrt genes in the development and evolution of sexual dimorphism. 
Trends Genet. 28, 175-184 (2012). 

22. Cho, S., Huang, Z. Y. & Zhang, J. Z. Sex-specific splicing of the honeybee doublesex 
gene reveals 300 million years of evolution at the bottom of the insect 
sex-determination pathway. Genetics 177, 1733-1741 (2007). 

23. Kijimoto, T., Moczek, A. P. & Andrews, J. Diversification of doublesex function 
underlies morph-, sex-, and species-specific development of beetle horns. 
Proc. Natl Acad. Sci. USA 109, 20526-20531 (2012). 

24. Tanaka, K., Barmina, O., Sanders, L. E., Arbeitman, M. N. & Kopp, A. Evolution of 
sex-specific traits through changes in HOX-dependent doublesex expression. PLoS 
Biol. 9, e1001131 (2011). 

25. Williams, T. M. et al. The regulation and evolution of a genetic switch controlling 
sexually dimorphic traits in Drosophila. Cell 134, 610-623 (2008). 

26. Loehlin, D. W. et al. Non-coding changes cause sex-specific wing size differences 
between closely related species of Nasonia. PLoS Genet. 6, e1000821 (2010). 

27. Charlesworth, B. & Charlesworth, D. Elements of Evolutionary Genetics (Roberts & 
Co., 2010). 

28. Hoffmann, A. A., Sgro, C. M. & Weeks, A. R. Chromosomal inversion polymorphisms 
and adaptation. Trends Ecol. Evol. 19, 482-488 (2004). 

29. Clark, R. et al. Colour pattern specification in the Mocker swallowtail Papilio 
dardanus: the transcription factor invected is a candidate for the mimicry locus H. 
Proc. R. Soc. Lond. B 275, 1181-1188 (2008). 

30. Scriber, J. M., Hagen, R. H. & Lederhouse, R. C. Genetics of mimicry in the tiger 
swallowtail butterflies, Papilio glaucus and P. canadensis (Lepidoptera: 
Papilionidae). Evolution 50, 222-236 (1996). 


Acknowledgements We thank W. Wang for sharing genome sequence data, C. Robinett 
for providing the Dsx-DM monoclonal antibody, and E. Westerman, S. Nallu, M. Zhang, 
G. Garcia and N. Pierce for assistance and discussion. This project was funded by 
National Science Foundation grant DEB-1316037 to M.R.K. 


Author Contributions K.K. conceived the project and helped design the study, reared 
mapping families and samples for gene expression analysis and genome sequencing, 
performed bulk-segregant analysis and RAD mapping, and contributed to drafting the 
manuscript. W.Z. generated the reference genome sequences and transcriptome 
assemblies, performed association mapping, GWAS analysis, HKA tests, structural 
variant detection and linkage disequilibrium analyses, analysis of protein structure and 
synonymous/non-synonymous calculations, and contributed to drafting the 
manuscript. A.T.-T. assisted with butterfly husbandry, performed fine mapping, cDNA 
sequencing and qRT-PCR analyses. D.H.P. performed qRT-PCR analyses. A.M. and 
R.D.R. performed Dsx immunohistochemistry. S.P.M. helped design the project and 
contributed to drafting the manuscript. M.R.K. designed and directed the project, 
analysed data and wrote the manuscript. 


Author Information Sequence data are available from NCBI SRA (SRP035394) and 
GenBank (KJ150616-KJ150623). Reprints and permissions information is available 
at www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. Correspondence 
and requests for materials should be addressed to M.R.K. (mkronforst@uchicago.edu) 
or K.K. (krushnamegh@ncbs.res.in). 


©2014 Macmillan Publishers Limited. All rights reserved 




doi:10.1038/nature13131 


Dynamic sensory cues shape song structure in Drosophila 


Philip Coen, Jan Clemens, Andrew J. Weinstein, Diego A. Pacheco, Yi Deng & Mala Murthy 


The generation of acoustic communication signals is widespread 
across the animal kingdom, and males of many species, including 
Drosophilidae, produce patterned courtship songs to increase their 
chance of success with a female. For some animals, song structure 
can vary considerably from one rendition to the next; neural noise 
within pattern-generating circuits is widely assumed to be the primary 
source of such variability, and statistical models that incorporate 
neural noise are successful at reproducing the full variation present 
in natural songs. In direct contrast, here we demonstrate that much 
of the pattern variability in Drosophila courtship song can be explained 
by taking into account the dynamic sensory experience of the male. 
In particular, using a quantitative behavioural assay combined with 
computational modelling, we find that males use fast modulations 
in visual and self-motion signals to pattern their songs, a relation- 
ship that we show is evolutionarily conserved. Using neural circuit 
manipulations, we also identify the pathways involved in song pat- 
terning choices and show that females are sensitive to song features. 
Our data not only demonstrate that Drosophila song production is 
not a fixed action pattern, but establish Drosophila as a valuable 
new model for studies of rapid decision-making under both social 
and naturalistic conditions. 

Drosophila melanogaster males chase females during courtship and 
produce song by wing vibration; females, meanwhile, arbitrate mating 
decisions. We developed a behavioural chamber to record acoustic sig- 
nals and fly movements simultaneously (Fig. 1a and Supplementary 
Video 1); fly movements provide information on the sensory cues that 
may influence song production. We collected a large data set (> 100,000 
song bouts) to model the relationship between sensory cues and song 
patterning. Most experiments involve females that are pheromone-insensitive 
and blind (termed PIBL) to facilitate auditory response measurements. 
All fly types used are described in Extended Data Table 1. 

For one wild-type strain (WT1), we show that using arista-cut (deaf) 
females or wing-cut (mute) males increased the time to copulation and 
decreased the percentage of mated pairs (Fig. 1c). This corroborates 
prior work demonstrating the importance of song for courtship success. 
Pairing WT1 males with wild-type, rather than PIBL, females did 
not alter these results (Fig. 1c), nor any of the results described below 
(not shown). All wild-type strains showed similar success with PIBL 
females (Extended Data Fig. 1b). Courtship songs comprise two modes 
(sine and pulse; Fig. 1b) and are part of a genetically hardwired mating 
ritual, thought to be stereotyped. However, we find that frequent mode 
transitions and variable mode durations individualize song bouts (Fig. 1d 
and Extended Data Fig. 2). 

Males spend approximately 20% of courtship time singing (Extended 
Data Fig. 1c), and bouts can begin with either song mode. Using 
reverse correlation, we found that all tracking parameters correlate 
with bout initiations (Extended Data Fig. 3). We therefore turned to 
the generalized linear model (GLM) (Fig. 2a), widely used to analyse 
binary response data with several explanatory variables. Unlike 
reverse correlation, the GLM we use includes a sparsity prior, which 
disentangles the contributions of correlated parameters to song patterning 
(see Methods); this represents a major difference between our approach 
and previous studies. 
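
As a rough illustration of the forward pass of such a model (Fig. 2a), the Python sketch below filters the recent history of a single feature with a linear filter h(t), sums the result into g(t), and converts it to an event probability P(t) with a logistic function. The filter weights, bias and velocity values are invented for the example; model fitting and the sparsity prior described in the Methods are not reproduced here.

```python
import math


def logistic(x: float) -> float:
    """Numerically stable logistic function mapping a real number to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def event_probability(feature_history, linear_filter, bias=0.0):
    """Weight one feature's recent history by a linear filter, then squash.

    Both sequences must have the same length and the same time ordering.
    """
    if len(feature_history) != len(linear_filter):
        raise ValueError("history and filter must have the same length")
    g = bias + sum(f * h for f, h in zip(feature_history, linear_filter))
    return logistic(g)


# Hypothetical numbers: three time bins of male forward velocity (mFV)
# and a made-up filter that weights the most recent bin most strongly.
filter_weights = [0.0, 0.1, 0.5]
slow = event_probability([1.0, 1.0, 1.0], filter_weights, bias=-2.2)
fast = event_probability([1.0, 2.0, 4.0], filter_weights, bias=-2.2)
print(f"P(event | steady velocity) = {slow:.2f}")
print(f"P(event | accelerating)    = {fast:.2f}")
```

With a positive filter, a recent increase in the feature raises the predicted event probability; a model with several features would sum the filtered contribution of each before applying the logistic function.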

Given similarities across fly strains (Extended Data Fig. 3), we com- 
bined data from all wild-type flies (315 pairs, 84,904 song bouts) for 
GLM analyses. We selected the most predictive features (= 600 ms of 
tracking parameter history) for each model based on deviance reduc- 
tion (Extended Data Fig. 4a, b). For pulse song starts, combining two 
features: male forward velocity (mFV) and male lateral speed (mLS) 
strongly improved model fit (Fig. 2b). When tested on separate data, 
the fraction of correctly classified song starts (PCor) was 0.67 (Fig. 2c), 
representing a 34% improvement over the null model (PCor = 0.5). This 
compares favourably with fMRI-based predictions of human behaviour 
Figure 1 | A novel assay to study Drosophila song behaviour. a, Behavioural 
chamber with tracked fly movements (see Methods). Fly movements are 
divided into: male/female forward velocity (mFV/fFV), male/female lateral and 
rotational speeds (mLS/fLS and mRS/fRS), the distance between fly centres 
(Dis), the absolute angle from female/male heading to male/female centre 
(Ang1/Ang2). b, Segmentation of song bouts into pulse (red) and sine (blue) 
elements (top). Corresponding traces for mFV and fFV (bottom). c, Song is 
important for mating. Time to copulation increases (black, *P < 0.001) and 
fraction of copulated pairs decreases (red, *P < 0.01) when females are deaf or 
males are mute. Individual points, mean, and s.d. are given for each genotype 
(n = 35-48 pairs). AC, arista cut; WC, wing cut. d, Song is variable. The 
number of repeated bouts (containing pulse and sine) per fly (see Methods). 
n = 60 wild-type males. 
Figure 2 | Song bout patterning is predictable and based on few features. 
a, Schematic of the GLM (see Methods). Inputs (stimulus histories, or features, 
f(t), for each movement parameter) are used to predict binary event 
probabilities. Significant features are convolved with a linear filter h(t), and 
the result, g(t), is transformed into a probability P(t), via a logistic function. 
Performance plots show the predicted and actual event probability 
relationships. Confusion matrices, from which we derive PCor values, quantify 
model performance. b, Filters for pulse and sine song initiation GLMs. Unlike 
male lateral speed (mLS) or male forward velocity (mFV), the Dis filter 
indicates a time lag between distance estimation and sine song initiation. 

c, GLM performance for identifying pulse song starts (PS) using male forward 
velocity and male lateral speed filters (n = 11,020 test events from 315 males) 
and sine song starts (SS) using the Dis filter (n = 2,476 test events from 315 
males). N = no song start. d, Male forward velocity pulse song start filters and 
Dis sine song start filters are similar for data from pheromone-insensitive or 
arista-cut males or males paired with arista-cut or sex-peptide-injected females; 
filters from wild-type males are also plotted. e, GLM performance for 
classifying current song mode (PM, pulse mode; SM, sine mode) using mean 
male forward velocity and male lateral speed (n = 55,464 test events from 315 
males). f, Filters for sine to pulse (S—P) transitions (top) and the pulse to sine 
(P-S) transitions (bottom). g, GLM performance for identifying S—P transitions 
(versus continued sine song (S-S)) using male forward velocity and male lateral 
speed filters (n = 17,118 test events from 315 males) and P-S transitions 
(versus continued pulse song (P-P)) using male forward velocity and female 
lateral speed filters (n = 11,748 test events from 315 males). Error bars (most 
too small to visualize) indicate 95% confidence intervals (c, e, g). 


and with two-alternative forced choice behavioural performance in 
Drosophila. PCor values are equivalent (r² = 0.98) to area under the 
curve values, an alternative performance measure (Extended Data 
Fig. 4c, d). We used a similar GLM framework to identify female for- 
ward velocity (fFV) as the best predictor of changes in male motion 
(male forward velocity (Extended Data Fig. 5a-c) and male lateral speed 
(data not shown)): that is, when the female speeds up, the male accelerates 
to follow her. Therefore, any correlation between male motion 
and song mode choice ultimately establishes a link between a sensory 
cue (for example, female motion) and song patterning. We address this 
point further below. 
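
The quoted improvement over the null model follows from simple arithmetic on the PCor values. A small Python sketch, assuming the improvement is expressed relative to the null PCor of 0.5 (the function name is illustrative):

```python
def relative_improvement(pcor: float, null_pcor: float = 0.5) -> float:
    """Fractional gain of a model's PCor over the null model's PCor."""
    return (pcor - null_pcor) / null_pcor


# PCor for pulse song starts reported in the text.
pulse_start_gain = relative_improvement(0.67)
print(f"improvement over null: {pulse_start_gain:.0%}")
```

For PCor = 0.67 this gives (0.67 - 0.5) / 0.5 = 0.34, matching the 34% figure.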

For songs that start in sine mode, the optimal model included only 
the distance between fly centres (Dis) (Fig. 2b, c and Extended Data 
Fig. 6a). Song start filters derived from PI (pheromone-insensitive) 
or arista-cut males paired with PIBL females, or from wild-type males 
paired with arista-cut or unreceptive/sex-peptide-injected (SP) females, 
were indistinguishable from wild-type filters (Fig. 2d), even though 
males take longer to copulate with arista-cut females (Fig. 1c), and never 
copulate with unreceptive/sex-peptide-injected females (Extended Data 
Fig. 7a). A model designed to distinguish song bouts beginning in sine 
versus pulse mode retains male forward velocity and the distance be- 
tween fly centres as the most predictive features, but with significantly 
increased performance (Extended Data Fig. 6b, c). Therefore, we focus 
hereafter on song patterning decisions, rather than the male’s decision 
to sing versus perform another courtship behaviour. Here a decision 
refers to a behavioural choice biased by sensory information. 

During song bouts, males typically alternate between sine and pulse 
modes, with each mode lasting tens to hundreds of milliseconds. We 
next investigated whether GLMs could also predict the current mode 
of song within bouts. Model performance was optimal using only two 
features: male forward velocity and male lateral speed (a 58% improve- 
ment over the null model; Fig. 2e). The absence of the distance between 
fly centres (Dis) feature in this model is probably due to its reduced 
variance during song (Extended Data Fig. 6d, e). Using different, male- 
centric, features only decreased model performance (Extended Data 
Fig. 8). We then went on to predict all mode transitions within a bout: 
increases in male forward velocity and male lateral speed predict tran- 
sitions to pulse mode, whereas decreases in male forward velocity and 
increases in female lateral speed predict transitions to sine mode (Fig. 2f). 
Mode transitions represent a subtle change in behaviour (for example, 
whether 300 ms of pulse song is followed by 30 ms of sine song or 30 ms 
of continued pulse song); nonetheless, our model predictions produced 
a combined PCor of 0.64 (Fig. 2g). Thus, taking into account male motion 
and inter-fly distance can largely explain variability in song patterning. 
Although studies in birds have shown that auditory cues, either produced 
by the singer itself or by a duetting partner, affect acoustic sequence 
generation, to our knowledge, ours is the first demonstration of a role 
for non-auditory sensory inputs in dynamically patterning the struc- 
ture of individual song sequences. 

Next, we considered which sensory pathways mediated the male’s 
decision-making during song production. Although male motion is 
the primary contributor to song patterning in our models, we observed 
a strong correlation (r² = 0.95) during song bouts between inter-fly 
distance (beyond the tactile range of ~5 mm; the tail of the distribution 
in Extended Data Fig. 6d) and the pulse/sine ratio (Fig. 3a; correlation 
is independent of male movement). We conclude that flies use vision 
to measure distance over this range, because blind males or wild-type 
males placed in the dark, sing significantly more pulse song (Fig. 3b, c); 
this is not true for any other sensory deficit and cannot be explained by 
changes in male speed (Extended Data Fig. 5e). 

Previous studies have demonstrated that separate neurons control 
pulse and sine song production. This indicates that song patterning 
is neurally controlled, and does not arise simply from mechanical coup- 
ling with male locomotion changes. In support of this, males sing both 
song modes at all velocities (Fig. 3d). We further conclude that visual 
measurements of optic flow are not used to convey male motion signals 
to song patterning networks, because a model based on only male for- 
ward velocity and male lateral speed predicts current song mode for 
blind males (Fig. 3e). This left two likely possibilities (Fig. 3f): either 
(mechanism 1) a cue from the female induces males to change speed 
and concomitantly affects song patterning or (mechanism 2) neural 
circuits that carry information about male motion (via either a copy of 
the motor commands or proprioceptive feedback from the legs) modu- 
late song patterning circuits. Because female forward velocity predicts 
Figure 3 | Neural pathways that modulate song patterning. a, Percentage of 
song in pulse mode (mean and s.d.) increases with inter-fly distance (r² = 0.95; 
see Methods). b, Pulse song percentage increases in blind males (n = 11-48 
flies, *P < 0.001). c, Individual WT2 males (paired with PIBL females) produce 
more pulse song in dark versus light (n = 5 flies, *P < 0.0001). d, Normalized 
event frequency for pulse or sine song (red or blue, n = 570,225 or n = 952,541, 
from 315 males) at each male forward velocity, across all wild-type strains. 

e, PCor values from classifying current song mode (using mean male forward 
velocity and male lateral speed) for wild-type strains (white bars) and various 
sensory manipulations (n = 924-16,256 test events from 11-48 flies for each 
model). f, Two potential neural circuit mechanisms underlying the correlation 
between male motion and song. Female cue(s) directly modulate both song 
patterning and locomotor circuits (left) or circuits carrying information about 
male motion (right) modulate song patterning circuits. g, Wild-type data 
were split into songs produced when females were not moving or moving 
(left, black or magenta, n = 9,454 or n = 46,204, test events from 315 males; see 
Methods). Inset shows corresponding male speeds. Corresponding PCor values 
from classifying current song mode using mean male forward velocity and male 
lateral speed (right). h, Song variability with TrpA1-activated flies (n = 14 
males from 3 genotypes) is similar to wild type (Fig. 1d). i, PCor values from 
classifying current song mode using mean mFV and mLS for Fru-A, Fru-B, and 
P1-activated males (n = 1,987 and 200 and 100 test events from 7 and 10 and 8 
males, respectively). j, For each genotype, the percentage of pulse song increases 
(*P < 0.01) when flies are fixed. k, Correlation between female forward velocity 
and the percentage of pulse song (n = 16,092 from 315 wild-type males paired 
with PIBL females, binned by percentile) at a 60 ms lag (r² = 0.91). Error bars 
indicate 95% confidence intervals (e, g, i). b and j show individual data points, 
mean and s.d. for each group. 


male motion (Extended Data Fig. 5a—c), song patterning would ulti- 
mately be dependent on sensory cues for both mechanisms. 

To distinguish between these mechanisms, we removed the link between 
female cues and male motion. Male lateral speed and male forward 
velocity still predicted current song mode when males are pheromone- 
insensitive and/or blind (Fig. 3e). Given that blind males do not follow 
females (Extended Data Fig. 5d), and produce song over a wide range 
of inter-fly distances/orientations (Extended Data Fig. 8d-g), it seems 
unlikely that a non-visual, long-range cue from the female patterns 
song in these males. Further, model performance remains high when 
females provide no motion cues (Fig. 3g and Extended Data Fig. 9). 
Finally, we examined the link between male motion and song pattern- 
ing without a female present. We artificially activated song and tar- 
geted three neural subsets (see Methods). For all genotypes, levels of 
inter-bout variability were similar to wild-type levels despite constitu- 
tive thermal activation (Fig. 3h). Again, male lateral speed and male 
forward velocity predicted current song mode (Fig. 3i) within the per- 
formance range for wild-type strains (Fig. 3e). By preventing these 
males from moving, we observed a marked reduction in sine song (Fig. 3j); 
because this song mode is typically produced at lower male speeds, our 
results indicate that a copy of the locomotor commands (presumably 
still active in ‘fixed’ males) is more likely (than proprioceptive feedback) 
to pattern song. Thus, our data support the conclusion that activity in 
locomotor circuits influences song patterning, favouring mechanism 2 
(Fig. 3f). Consistent with this, we observed the strongest correlation 
(r² = 0.91) between female forward velocity and the pulse/sine ratio in 
wild-type flies at the delay at which males follow females (Fig. 3k and 
Extended Data Fig. 5b). 

As a final test of the importance of sensory cues in song patterning, 
we considered song bout terminations. The exponential distribution of 
syllable durations in songbirds has been proposed to support a stochastic 
mechanism for syllable termination. Bout durations in Drosophila are 
also well fit by an exponential function (r² = 0.98; Extended Data Fig. 2). 
However, we identified female lateral speed as a significant predictor 
of bout ends (Fig. 4a, b). We posit that when males sense changes in 
female lateral speed they either transition to sine song (Fig. 2f) or they 
end song altogether. 
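
The exponential fit mentioned above can be illustrated with a minimal sketch. It assumes one simple procedure (a least-squares line through the origin on the log of the empirical survival curve) and is not necessarily the fitting method used for Extended Data Fig. 2. The function name `fit_exponential` and the example durations are hypothetical.

```python
import math


def fit_exponential(durations):
    """Fit S(t) = exp(-rate * t) to the empirical survival curve.

    Returns (rate, r_squared), where r_squared is computed on log-survival.
    """
    t = sorted(durations)
    n = len(t)
    # Midpoint plotting positions keep every survival value strictly in (0, 1).
    log_s = [math.log(1.0 - (i + 0.5) / n) for i in range(n)]
    # Least-squares slope for a line through the origin: log S = -rate * t.
    rate = -sum(ti * yi for ti, yi in zip(t, log_s)) / sum(ti * ti for ti in t)
    mean_y = sum(log_s) / n
    ss_res = sum((yi + rate * ti) ** 2 for ti, yi in zip(t, log_s))
    ss_tot = sum((yi - mean_y) ** 2 for yi in log_s)
    return rate, 1.0 - ss_res / ss_tot


# Hypothetical bout durations (s) placed at exact quantiles of an
# exponential distribution with rate 2 per second.
durations = [-math.log(1.0 - (i + 0.5) / 200) / 2.0 for i in range(200)]
rate, r2 = fit_exponential(durations)
```

A high r² on the log-survival scale only shows that the distribution is close to exponential; as the text notes, it does not by itself rule out sensory predictors of bout ends.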

On the basis of the data presented, we propose that, for Drosophila, 
detection of a female gates the song production pathway. Once gated, 
sensory cues act directly (inter-fly distance and female lateral speed) or 
indirectly (via male forward velocity and male lateral speed) to pattern 
song on short timescales (Fig. 4c). Although trial-to-trial variability in 
acoustic signals may be useful for song learning in birds, Drosophila 
males do not learn their songs. Therefore, we considered alternative 
roles for patterning decisions in fly mating behaviours. Because female 
speed decreases before copulation (Fig. 4d), we reasoned that female 
slowing was a sign of receptivity. Indeed, using the GLM framework, 
we found a negative correlation between the amount of either song mode, 
in a given time window, and female speed (Fig. 4e). This relationship 
was reduced for deaf females, whereas unreceptive (SP) females showed 
a positive correlation between speed and song amount, a reversal of 
wild-type behaviour. In addition, females increased or decreased speed 
in response to song from Drosophila simulans (heterospecific) or WT1 
(conspecific) males, respectively. This is particularly striking when con- 
sidering that, for the same experiment, male motion predicts song 
mode choice for both D. simulans and D. melanogaster males (Fig. 4f). 
Therefore, male song patterning choices are unlikely to be used by 
females for mate selection. Indeed, successful males (those that copu- 
lated within 30 min) and unsuccessful males pattern song similarly 
(Extended Data Fig. 7b-d). We speculate that males bias towards their 
louder form of song (pulse) when far from the female (captured by 
the distance between fly centres feature) or when trying to catch up to, 
or locate, the female (captured by the male forward velocity and male 
lateral speed features); this would maximize the probability of the female 
hearing his song and decreasing her locomotion. It remains to be deter- 
mined, however, over which specific distances and angles females can 
detect each song mode (and amplitude modulations therein). 

In conclusion, instinctive behaviours, like acoustic signal production, 
have been generally considered to comprise a series of fixed action patterns, 
elicited and oriented by sensory information. Courtship song 
production in Drosophila has long been regarded as an example of such 
a fixed action sequence, with the female serving as the trigger stimulus. 
In contrast to this view, we show that even the simple fly uses sensory 

Figure 4 | Song patterning decisions and female responses. a, Filter from 
GLM for song ends. b, GLM performance for identifying song ends (SE) using 
female lateral speed (n = 10,708 events from 315 males). N, no song end. 

c, Summary of the influence of sensory inputs on song patterning, as revealed 
by GLM analysis. d, Normalized changes in female motion before copulation 
(n = 233 flies). e, GLM coefficient values between pulse or sine song density and 


information to pattern his song sequences over short timescales. These 
data therefore offer a new window into the study of instinctive beha- 
viours, and indicate that song production in flies may be more ana- 
logous to complex motor behaviours, such as prey capture, known to 
rely heavily on sensory feedback for patterning. More broadly, and 
consistent with recent studies of fly flight and human mobility, we 
suspect that seemingly stochastic behaviours may be more predictable 
than expected. 


METHODS SUMMARY 


Behavioural data (song and video recordings) were acquired by pairing virgin male 
and female flies in a custom-built chamber, designed to capture fly acoustic signals 
throughout the environment (~25 mm diameter, tiled with 9 microphones and 
connected to a specialized amplifier) and to be compatible with our fly tracking 
software. Male wild-type strains came from diverse geographic locations; most 
females tested were genetically engineered to be both blind (GMR-hid) and 
pheromone-insensitive (orco). Other genetic and physical manipulations included arista 
cutting (deaf flies), wing cutting (mute males), sex-peptide-injection and oenocyte 
removal. Neural activation was achieved by expressing TrpA1 in three different 
subsets of Fru+ neurons and heating the entire chamber before introduction of 
male flies. All data processing and analysis was conducted in MATLAB. Song was 
segmented as previously described. A modified generalized linear model, which 
uses a sparseness prior in order to penalize redundant features, was implemented 
to determine whether fly movements and positions could predict male song pat- 
terning choices, including bout initiation, song mode (pulse versus sine) within a 
bout, mode transitions and bout termination. When fitting or testing models over 
1,000 iterations, data were randomly subsampled to equalize the frequency of each 
event type (a common method for dealing with uneven event frequencies). Predictive 
features for each model were selected using deviance reduction and model perform- 
ance was tested using independent data sets. PCor values were used to quantify 
model performance. To measure female responses to song, female speed and amount 
of male song were compared using a 1 min sliding window, with 50% overlap. 
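
Two of the procedures described above, subsampling to equalize event frequencies and a sliding window with 50% overlap, can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB code; the function names, the label strings and the window length in samples are assumptions made for the example.

```python
import random


def balanced_subsample(labels, rng):
    """Return indices with every event type drawn down to the rarest type's count."""
    by_label = {}
    for idx, lab in enumerate(labels):
        by_label.setdefault(lab, []).append(idx)
    n_min = min(len(idxs) for idxs in by_label.values())
    chosen = []
    for idxs in by_label.values():
        # Sample without replacement so no event is used twice in one iteration.
        chosen.extend(rng.sample(idxs, n_min))
    return sorted(chosen)


def sliding_windows(n_samples, window, overlap=0.5):
    """Yield (start, stop) index pairs for windows with fractional overlap."""
    step = max(1, int(round(window * (1.0 - overlap))))
    for start in range(0, n_samples - window + 1, step):
        yield start, start + window


# Hypothetical event labels: pulse events outnumber sine events.
labels = ["pulse"] * 10 + ["sine"] * 3
idx = balanced_subsample(labels, random.Random(0))
windows = list(sliding_windows(10, 4))
```

Repeating `balanced_subsample` with a fresh random draw on each iteration means a classifier that always guesses the more common event type would score at chance, which is the point of equalizing frequencies before fitting or testing.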


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

An excitatory paraventricular nucleus to AgRP 
neuron circuit that drives hunger 

Hunger is a hard-wired motivational state essential for survival. 
Agouti-related peptide (AgRP)-expressing neurons in the arcuate 
nucleus (ARC) at the base of the hypothalamus are crucial to the 
control of hunger. They are activated by caloric deficiency and, when 
naturally or artificially stimulated, they potently induce intense 
hunger and subsequent food intake. Consistent with their obligatory 
role in regulating appetite, genetic ablation or chemogenetic 
inhibition of AgRP neurons decreases feeding. Excitatory input 
to AgRP neurons is important in caloric-deficiency-induced activation, 
and is notable for its remarkable degree of caloric-state-dependent 
synaptic plasticity. Despite the important role of excitatory 
input, its source(s) has been unknown. Here, through the use 
of Cre-recombinase-enabled, cell-specific neuron mapping techniques 
in mice, we have discovered strong excitatory drive that, unexpectedly, 
emanates from the hypothalamic paraventricular nucleus, specifically 
from subsets of neurons expressing thyrotropin-releasing hormone 
(TRH) and pituitary adenylate cyclase-activating polypeptide (PACAP, 
also known as ADCYAP1). Chemogenetic stimulation of these 
afferent neurons in sated mice markedly activates AgRP neurons 
and induces intense feeding. Conversely, acute inhibition in mice 
with caloric-deficiency-induced hunger decreases feeding. Discovery 
of these afferent neurons capable of triggering hunger advances 
understanding of how this intense motivational state is regulated. 

To identify monosynaptic inputs to AgRP neurons, we used a modified 
rabies virus SADΔG-EGFP (EnvA) in combination with Cre-dependent 
helper adeno-associated viruses (AAVs) expressing TVA (receptor for 
the avian sarcoma leucosis virus glycoprotein EnvA; AAV8-FLEX-TVA-mCherry) 
and RG (rabies envelope glycoprotein; AAV8-FLEX-RG). 
When used with Agrp-IRES-Cre mice, TVA and RG, respectively, 
allow for rabies infection of AgRP neurons and subsequent retrograde 
trans-synaptic spread (Fig. 1a). AAV targeting of the helper viruses was 
specific to AgRP neurons (Supplementary Fig. 1). Three weeks post-AAV 
transduction, we injected SADΔG-EGFP (EnvA) into the same 
area and examined brains 7 days later for EGFP+ signal. We detected 
the highest number of EGFP+ cells in the ARC (38%), probably representing 
the initially infected AgRP neurons, and possibly local afferents 
(Fig. 1b; Supplementary Fig. 2). We next evaluated distant upstream 
anatomical areas for EGFP+ neurons and found that the vast majority 
were located in two hypothalamic nuclei, the dorsal medial hypothalamus 
(DMH, 26%) which contains both glutamatergic and GABAergic 
neurons and the paraventricular hypothalamus (PVH, 18%) consisting 
primarily of glutamatergic neurons (Fig. 1b; Supplementary Fig. 2). 
Finally, we also observed a smaller number of EGFP+ cells in other 
hypothalamic sites (Supplementary Fig. 2). 


We next used channelrhodopsin (ChR2)-assisted circuit mapping 
(CRACM) to both confirm and determine valence of functional monosynaptic 
connectivity between afferents and AgRP neurons. CRACM 
involves in vivo targeted expression of ChR2, a photoexcitable cation 
channel, in presumptive presynaptic upstream neurons (and their terminals), 
followed by ex vivo electrophysiologic assessment in acute 
brain slices of light-evoked postsynaptic currents in candidate downstream 
neurons. To investigate excitatory input to AgRP neurons, we 
stereotaxically injected Cre-dependent AAV expressing ChR2-mCherry 
(AAV8-DIO-ChR2-mCherry) (Supplementary Fig. 3a) into brain sites of 
Vglut2-IRES-Cre; Npy-hrGFP mice. VGLUT2 (also known as SLC17A6) 
is the glutamate synaptic vesicle transporter expressed in the hypothalamus, 
hence Vglut2-IRES-Cre mice target relevant excitatory neurons. 
As AgRP neurons co-express neuropeptide Y (NPY), Npy-hrGFP mice 
allow visualization of AgRP neurons. Consistent with the rabies tracing, 
we detected light-evoked excitatory post-synaptic currents (EPSCs) 
in all VGLUT2^DMH→AgRP^ARC neurons tested (latency between onset 
of light and EPSC = 4.7 ± 0.2 ms; Fig. 1c; Supplementary Fig. 3f). These 
were blocked by CNQX (6-cyano-7-nitroquinoxaline-2,3-dione), an 
AMPA receptor antagonist, confirming their glutamatergic nature. Next, 
we examined monosynaptic connections between VGLUT2^PVH→AgRP^ARC 
neurons and again, consistent with the rabies mapping, we observed 
light-evoked EPSCs in all AgRP neurons tested (latency = 4.9 ± 0.4 ms; 
Fig. 1d; Supplementary Fig. 3g). These also were blocked by CNQX. 

In addition, we selectively expressed ChR2 in the ventral medial 
hypothalamus (VMH) and lateral hypothalamus (LH), two sites with 
few EGFP+ cells, and also the ARC, which could provide local afferents, 
and investigated possible connectivity to AgRP neurons. In agreement 
with the negative rabies data, no light-evoked EPSCs were detected in 
36 out of 37 VGLUT2^VMH→AgRP^ARC neurons tested (Supplementary 
Fig. 3b, h) or in any VGLUT2^LH→AgRP^ARC neurons tested (Supplementary 
Fig. 3c, i). Likewise, we failed to detect light-evoked EPSCs in any 
VGLUT2^ARC→AgRP^ARC neurons tested (Supplementary Fig. 3d, j). 
However, and as previously noted, glutamatergic VMH neurons were 
monosynaptically connected to nearby pro-opiomelanocortin (POMC) 
neurons (VGLUT2^VMH→POMC^ARC), as we observed light-evoked 
EPSCs in all POMC neurons tested (latency = 4.4 ± 0.2 ms; Supplementary 
Fig. 3e). 

The CRACM studies suggest marked differences in the strength of 
VGLUT2^PVH→AgRP^ARC versus VGLUT2^DMH→AgRP^ARC inputs. First, 
the amplitude of light-evoked EPSCs generated from VGLUT2^PVH inputs 
was approximately threefold greater (Fig. 1e). Second, the effectiveness 
of light pulses in evoking EPSCs differed, with DMH inputs showing a 
much higher failure rate (~32%; Fig. 1f; Supplementary Fig. 4) compared 

Figure 1 | Mapping and evaluating connectivity of inputs to AgRP^ARC 
neurons. a, Rabies schematic. b, EGFP detected in the ARC and DMH (left) 
and PVH (right) in Agrp-IRES-Cre mice. dDMH, dorsal DMH; vDMH, ventral 
DMH. c, d, Top, schematic shows connections being tested. Right, 
representative brains from mice stereotaxically injected with AAV8-DIO-ChR2-mCherry 
(red = ChR2-mCherry, green = hrGFP from Npy-hrGFP). 
Bottom, representative traces showing assessment of light-evoked excitatory 
postsynaptic currents (EPSCs) with a blue tic indicating the light pulse (473 nm 
wavelength, 2 msec). Mice used were Vglut2-IRES-Cre; Npy-hrGFP. ChR2 was 


with PVH inputs in which time-locked EPSCs followed every light 
pulse (0% failure rate; Fig. 1g; Supplementary Fig. 4). To test for consequences 
of these synaptic strength differences, we assessed the ability 
of VGLUT2^DMH→AgRP^ARC or VGLUT2^PVH→AgRP^ARC inputs 
to fire action potentials in AgRP neurons. Whereas we never detected 
VGLUT2^DMH→AgRP^ARC light-evoked action potentials (Fig. 1h), we 
observed VGLUT2^PVH→AgRP^ARC light-evoked action potentials approximately 
75% of the time (Fig. 1i), demonstrating potent, excitatory input 
to AgRP neurons originating from the PVH. 
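
The failure rate referred to above is the fraction of light pulses that are not followed by a time-locked EPSC. A minimal sketch of that arithmetic, assuming each pulse has already been scored as a hit or a failure (the scoring criterion itself is not given in the text, and the function name and example counts are hypothetical):

```python
def failure_rate(epsc_detected):
    """Fraction of light pulses that were not followed by a time-locked EPSC."""
    if not epsc_detected:
        raise ValueError("need at least one light pulse")
    failures = sum(1 for hit in epsc_detected if not hit)
    return failures / len(epsc_detected)


# Hypothetical scoring of 25 light pulses for one recorded neuron:
# 17 pulses evoked an EPSC and 8 did not.
example_rate = failure_rate([True] * 17 + [False] * 8)
no_failures = failure_rate([True] * 10)
```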

To determine which PVH neurons monosynaptically drive AgRP 
neurons, we used many lines of Cre-expressing mice, each targeting 
subsets of PVH neurons, and then used CRACM to determine connectivity. 
Sim1 encodes the single-minded homologue 1 transcription 
factor required for developmental specification of the PVH, and is thus 
expressed in most PVH neurons, as verified in the Sim1-Cre transgenic 
mouse. We observed light-evoked EPSCs in all SIM1^PVH→AgRP^ARC 
neurons tested (latency = 4.6 ± 0.4 ms; Fig. 2a). We next surveyed the 
Allen Brain Atlas for mRNAs highly enriched in the PVH, and then 
assembled a series of IRES-Cre knock-in mice targeting cells marked by 
prodynorphin (Pdyn-IRES-Cre; Supplementary Fig. 5), oxytocin (Oxt-IRES-Cre), 
arginine vasopressin (Avp-IRES-Cre; Supplementary Fig. 6), 
corticotropin-releasing hormone (Crh-IRES-Cre; Supplementary Fig. 7), 
TRH (Trh-IRES-Cre; Supplementary Fig. 8) and PACAP (Pacap-IRES-Cre; 
Supplementary Fig. 9). Following ChR2 delivery into the PVH, we 


targeted to DMH (c) and PVH (d). CNQX is an AMPA receptor blocker. 
e, Average amplitude of light-evoked EPSCs (pA) in AgRP neurons (n = 40 for 
VGLUT2^DMH→AgRP^ARC group; n = 45 for VGLUT2^PVH→AgRP^ARC group). 
Results are shown as mean ± s.e.m. P values for unpaired comparisons were 
calculated by two-tailed Student’s t-test. **P < 0.001. f, g, Representative raster 
plots of EPSCs for VGLUT2^DMH→AgRP^ARC (f) and VGLUT2^PVH→AgRP^ARC 
(g). h, i, Representative traces showing light-evoked changes in membrane 
potentials in AgRP neurons for VGLUT2^DMH→AgRP^ARC (h) and 
VGLUT2^PVH→AgRP^ARC (i). 


failed to detect light-evoked EPSCs in any PDYN^PVH→AgRP^ARC (Fig. 2b), 
OXT^PVH→AgRP^ARC (Supplementary Fig. 10a), AVP^PVH→AgRP^ARC 
(Supplementary Fig. 10b) or CRH^PVH→AgRP^ARC (Supplementary 
Fig. 10c) neurons tested. However, we detected light-evoked EPSCs 
in all TRH^PVH→AgRP^ARC neurons (latency = 4.7 ± 0.4 ms; Fig. 2c) 
and all PACAP^PVH→AgRP^ARC neurons tested (latency = 4.8 ± 0.3 ms; 
Fig. 2d). These findings demonstrate that TRH^PVH and PACAP^PVH 
neurons provide excitatory drive to AgRP neurons. It is not known if 
these excitatory inputs come from two distinct classes of PVH neurons 
or from one which co-expresses TRH and PACAP. The latter is possible 
given that 37% of TRH mRNA-expressing neurons are marked 
by Pacap-IRES-Cre (Supplementary Fig. 11). 

Innervation by PACAP^PVH neurons suggests that PACAP, in addition 
to glutamate, could also activate AgRP neurons. To assess this, we 
exogenously added PACAP1–38 (100 nM) and determined effects on 
activity of synaptically isolated AgRP neurons (in the presence of picrotoxin 
(PTX) and kynurenate). Of note, PACAP1–38 markedly depolarized 
AgRP neurons and increased their firing rate, and this was prevented by 
co-addition of the PAC1-receptor blocker, PACAP6–38 (200 nM) (Fig. 2e-h). 
Thus, in addition to glutamate, PACAP working through PAC1 receptors also 
probably plays an important role in the PACAP^PVH→AgRP^ARC circuit. 

POMC and AgRP neurons lie in close proximity within the ARC, 
but have opposing effects on feeding. If fidelity of the identified 
PVH→AgRP^ARC neuron hunger circuit is to be high, then it should 

Figure 2 | TRH^PVH and PACAP^PVH neurons provide excitatory input to 
AgRP neurons. a-d, Top, schematic shows connections being tested. 
Right, representative brain sections of PVH injected with AAV8-DIO-ChR2-mCherry 
(white = ChR2-mCherry). Bottom, representative traces showing 
assessment of light-evoked EPSCs with blue tic indicating the light pulse 
(473 nm wavelength, 2 msec). Mice used were Sim1-Cre; Npy-hrGFP (a), 
Pdyn-IRES-Cre; Npy-hrGFP (b), Trh-IRES-Cre; Npy-hrGFP (c) and 
Pacap-IRES-Cre; Npy-hrGFP (d). CNQX is an AMPA receptor blocker. 


not also engage satiety-promoting POMC neurons. We evaluated this 
and found that the vast majority of TRH^PVH→POMC^ARC neurons (21 
of 22) and all PACAP^PVH→POMC^ARC neurons recorded are not monosynaptically 
connected (Fig. 3a, b, respectively). However, consistent 
with our observation that VGLUT2^VMH→POMC^ARC neurons are connected 
(Supplementary Fig. 3e), and as a positive control for studies 
investigating PACAP→POMC^ARC connections, we detected PACAP^VMH→POMC^ARC 
monosynaptic connections (latency = 4.5 ± 0.2 ms; Fig. 3c). 
These studies clearly demonstrate that the identified PVH→AgRP^ARC 
neuron circuit selectively drives hunger-promoting AgRP neurons, 
but not satiety-promoting POMC neurons. 

Fidelity should also exist in the downstream connections of orexi- 
genic AgRP neurons, which have reciprocal connections with the PVH. 
GABAergic AgRP neurons are monosynaptically connected with a 
subset of satiety-promoting neurons in the PVH, and this inhibitory 

Figure 3 | Fidelity of TRH^PVH/PACAP^PVH→ARC and AgRP^ARC→PVH 
circuitry. a-f, Left, schematic shows connections being tested. Right, 
representative traces showing assessment of light-evoked EPSCs (a-c) or IPSCs 
(d-f) with blue tic indicating the light pulse (473 nm wavelength, 2 msec). Mice 
used were Trh-IRES-Cre; Pomc-hrGFP (a), Pacap-IRES-Cre; Pomc-hrGFP 
(b, c), Agrp-IRES-Cre; Sim1-Cre; R26-loxSTOPlox-L10-GFP (d) mice for 



e-h, Baseline and effects of PACAP (100 nM) with or without PAC1R blocker 
(PACAP6–38; 200 nM) on membrane potential and firing rate of AgRP neurons 
(n = 18, n = 10, n = 8, respectively). Results are shown as mean ± s.e.m. 
g, h, n = 18 for baseline group; n = 10 for PACAP1–38 group; n = 8 for 
PACAP1–38 + PACAP6–38 group. P values for pair-wise comparisons (baseline 
versus PACAP1–38 group) and unpaired comparisons (baseline versus 
PACAP1–38 + PACAP6–38 group) were calculated by two-tailed Student’s 
t-test. **P < 0.001. 


connection drives feeding. Here we confirm this inhibitory AgRP^ARC→PVH 
connection following ChR2 expression in AgRP neurons, as we 
detected light-evoked IPSCs (latency = 5.2 ± 0.3 ms) in a subset (55%) 
of AgRP^ARC→SIM1^PVH neurons (Fig. 3d), which were blocked by 
PTX. If fidelity of this hunger-promoting, GABAergic reciprocal 
AgRP^ARC→PVH connection is to be high, then it should not also 
engage and consequently inhibit the TRH^PVH and PACAP^PVH neurons. 
As expected, we detected no AgRP^ARC→TRH^PVH connections 
(Fig. 3e) and only rare AgRP^ARC→PACAP^PVH connections (1 of 12 neurons 
tested) (Fig. 3f). The above studies demonstrate marked fidelity in 
the reciprocal TRH/PACAP^PVH glutamatergic to AgRP^ARC GABAergic 
to satiety^PVH neuron circuit that drives feeding (Fig. 3g). 

To confirm function of the TRH^PVH and PACAP^PVH neurons, we 
targeted DREADDs (designer receptors exclusively activated by designer 
drugs; AAV8-DIO-hM3Dq-mCherry) to the PVH. Upon binding of 


visualization of SIM1 neurons, Agrp-IRES-Cre; Trh-IRES-Cre (e) mice injected 
with AAV8-DIO-mCherry into the PVH for visualization of TRH neurons and 
Agrp-IRES-Cre; Pacap-IRES-Cre; R26-loxSTOPlox-L10-GFP (f) mice for 
visualization of PACAP neurons. CNQX is an AMPA receptor blocker. PTX is 
a GABA-A receptor blocker. g, Model summarizing reciprocal circuitry and 
its fidelity. 
clozapine-N-oxide (CNO), hM3Dq activates neurons through the Gq 
signalling cascade. After verifying in brain slices that CNO depolarized 
and increased firing frequency of TRH^PVH and PACAP^PVH neurons 
(Fig. 4a, d, respectively), we found that acutely stimulating upstream 
TRH^PVH or PACAP^PVH neurons markedly induced Fos activity in 
AgRP cells, ipsilateral to the DREADD-activated upstream PVH neurons 
(Fig. 4b, e, respectively). 

We next assessed effects of bilateral hM3Dq-DREADD-mediated 
TRH^PVH or PACAP^PVH neuron stimulation on feeding behaviour during 
the light cycle, a time when food intake is usually low because of feeding 
during the preceding dark cycle (as observed following saline injections, 


black lines in Fig. 4c, f, respectively). Importantly, acute activation by 
CNO injection of either TRH^PVH or PACAP^PVH neurons caused robust 
feeding (red lines in Fig. 4c, f, respectively). This effect was absent 
following an overnight fast which itself elevates excitatory drive and 
consequently AgRP neuron activity (Supplementary Fig. 12). To 
directly demonstrate that this enhanced feeding was mediated through 
AgRP neurons, we activated TRH^PVH neurons (AAV8-DIO-hM3Dq-mCherry) 
while simultaneously inhibiting AgRP^ARC neurons (AAV8-DIO-hM4Di-mCherry; 
Fig. 4g). This significantly attenuated feeding 
(Fig. 4h). Finally, to examine whether endogenous activity of the 
TRH^PVH→AgRP^ARC pathway is physiologically relevant for feeding, we bilaterally 



Figure 4 | DREADD-mediated manipulation of TRH^PVH or PACAP^PVH 
neurons mediates feeding through AgRP neurons. a, d, Membrane potential 
and firing rate of TRH^PVH (a) or PACAP^PVH (d) neurons transduced with 
AAV8-DIO-hM3Dq-mCherry upon CNO application. b, e, AAV8-DIO-hM3Dq-mCherry 
was transduced unilaterally into the PVH of Trh-IRES-Cre 
(b) or Pacap-IRES-Cre (e) mice and ipsilateral induction of Fos (red) was 
assessed in AgRP neurons (marked by hrGFP, green) 3 h following CNO 
(0.3 mg per kg) injection (n = 3, n = 3; respectively). Data represent 
mean ± s.e.m. P values for unpaired comparisons were calculated by two-tailed 
Student’s t-test. *P < 0.05. c, f, Left, AAV8-DIO-hM3Dq-mCherry (white) was 
transduced bilaterally into the PVH of Trh-IRES-Cre (c) or Pacap-IRES-Cre 
(f) mice. Right, light-cycle food intake after injection of saline (black) or CNO 
(red; 0.3 mg per kg) (n = 7-8 animals per condition; experiment replicated 
three times per animal). n = 8 for Trh-IRES-Cre group; n = 7 for Pacap-IRES-Cre 
group. Data represent mean ± s.e.m. Two-way repeated measures ANOVA 
detected significant interaction of ‘time’ and ‘treatment’. F3 9 = 56.13, 
**P < 0.001; F31g = 0.056, **P < 0.001; respectively. Sidak’s post-hoc test 
shows significant difference between ‘time’ and ‘treatment’ at 1 and 2 h as 
indicated, respectively. g, h, Simultaneous inhibition of AgRP^ARC neurons with 
activation of TRH^PVH neurons attenuates food intake. g, Sagittal schematic of 
occlusion study, whereby AAV8-DIO-hM3Dq-mCherry (white) was 
transduced bilaterally into the PVH (top) and AAV8-DIO-hM4Di-mCherry 
(white) was transduced bilaterally into the ARC (bottom) of Trh-IRES-Cre; 
Agrp-IRES-Cre mice. h, Light-cycle food intake after injection of saline (black) 
or CNO (red; 0.3 mg per kg) in Trh-IRES-Cre; Agrp-IRES-Cre (dotted; hM3Dq 
in PVH and hM4Di in ARC) and Trh-IRES-Cre (solid; hM3Dq in PVH) mice 
(n = 4 animals per condition; experiment replicated three times per animal). 
Data represent mean ± s.e.m. Two-way repeated measures ANOVA detected 
significant interaction of ‘genotype’ and ‘treatment’. F313 = 6.083, *P < 0.05. 
Sidak’s post-hoc test shows significant difference between ‘genotype’ and 
‘treatment’ as indicated. i, Membrane potential and firing rate of TRH^PVH 
neurons transduced with AAV8-DIO-hM4Di-mCherry upon CNO 
application (top). AAV8-DIO-hM4Di-mCherry (white) was transduced 
bilaterally into the PVH of Trh-IRES-Cre mice (bottom). j, Dark-cycle food 
intake after injection of saline (black) or CNO (red; 0.3 mg per kg) (n = 4 
animals per condition; experiment replicated three times per animal). Results 
are shown as mean ± s.e.m., *P < 0.05, **P < 0.001; see Supplementary 
Information for statistical analyses. Data represent mean ± s.e.m. Two-way 
repeated measures ANOVA detected significant interaction of ‘time’ and 
‘treatment’. F313 = 3.962, *P < 0.05. Sidak’s post-hoc test shows significant 
difference between ‘time’ and ‘treatment’ at 3 h as indicated. 

transduced TRH^PVH neurons with inhibitory DREADDs, which led to 
rapid hyperpolarization and decreased firing rate upon CNO application 
(as assessed in slice studies, Fig. 4i), and found that this manipulation 
drastically reduced food intake during the dark cycle (Fig. 4j). 
These studies demonstrate both the sufficiency and necessity of this 
PVH→AgRP^ARC orexigenic circuit in regulating food intake. 

Given the vast number of studies implicating an anorexigenic role 
for the PVH, it is remarkable to now discover embedded within 
the PVH a subset of neurons that drive feeding, through a reciprocal 
circuit (Fig. 3g). Failure of prior studies to detect orexigenic activity in 
the PVH is probably due to the reciprocal nature of this hunger circuit 
(leaving and returning to the PVH) and the inhibitory aspect of its 
return arm (AgRP^ARC→satiety^PVH neurons). 


METHODS SUMMARY 


All experiments were conducted according to US National Institutes of Health 
guidelines for animal research and were approved by the Beth Israel Deaconess 
Medical Center Institutional Animal Care and Use Committee. 

Mice. Pdyn-IRES-Cre, Avp-IRES-Cre, Crh-IRES-Cre, Trh-IRES-Cre, Pacap-IRES-Cre 
and R26-loxSTOPlox-L10-GFP mice were generated using recombineering 
techniques as previously described. 

Viral injections. Stereotaxic injections were performed as previously described. 
Rabies cell counting. For each brain, images were taken throughout one entire 
brain series, EGFP+ cells were quantified and assigned to a specific anatomical 
structure, and subsequently expressed as a percentage of the total number of cells 
counted for each mouse, averaged over the entire cohort. 
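
The counting scheme just described (each region's share of one mouse's labelled cells, then the mean of those shares across mice) can be written out as a short sketch. The function name and the cell counts below are hypothetical and are not data from the study.

```python
def regional_percentages(counts_per_mouse):
    """Average, across mice, each region's share of that mouse's labelled cells."""
    regions = sorted({r for counts in counts_per_mouse for r in counts})
    means = {}
    for region in regions:
        shares = []
        for counts in counts_per_mouse:
            total = sum(counts.values())
            # Percentage is taken within each mouse before averaging.
            shares.append(100.0 * counts.get(region, 0) / total)
        means[region] = sum(shares) / len(shares)
    return means


# Hypothetical EGFP+ cell counts for two mice.
mice = [
    {"ARC": 40, "DMH": 30, "PVH": 30},
    {"ARC": 20, "DMH": 10, "PVH": 20},
]
pct = regional_percentages(mice)
```

Normalizing within each mouse before averaging keeps a single heavily labelled brain from dominating the cohort percentages.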

Electrophysiology. Electrophysiology experiments were performed as previously 
described. Latencies between onset of light and post synaptic currents are expressed 
as mean ± s.e.m. 
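
For reference, mean ± s.e.m. is the sample mean together with the sample standard deviation divided by the square root of n. A minimal sketch using Python's standard library follows; the function name and the latency values are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample of at least two values."""
    n = len(values)
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical onset latencies in ms.
m, s = mean_sem([4.0, 5.0, 6.0])
```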

In situ hybridization. Digoxigenin-labelled riboprobes against Pacap, Pdyn or 
Trh were hybridized to brain sections of Pacap-IRES-Cre; R26-loxSTOPlox-L10-GFP, 
Pdyn-IRES-Cre; R26-loxSTOPlox-L10-GFP or Trh-IRES-Cre mice injected 
bilaterally with AAV-DIO-GFP into the PVH as previously described with minor 
modifications. Signals from intrinsic Cre-mediated GFP expression and in situ 
hybridization were compared to determine the degree of co-localization. 

Fos assay. Assessment of Fos induction used a previously developed method* mod- 
ified for fluorescent co-localization with hrGFP in AgRP neurons. 

Food intake. Food intake studies on chow were performed as previously described**. 

Statistics. Statistical analyses were performed using Origin Pro 8.6 and Prism 6.0 
(GraphPad) software. Feeding studies were run as a within-subject design and a final 
consumption value for each animal was obtained from an average of 3 trials. Data were 
analysed using a two-way repeated-measures ANOVA, with interaction of ‘time’ and 
‘treatment’ or ‘genotype’ and ‘treatment’. *P < 0.05, **P < 0.001. Error bars indicate 
mean ± s.e.m. For slice electrophysiology experiments and cell counting 
analyses, P values for pair-wise or unpaired comparisons were calculated by 
two-tailed Student’s t-test. *P < 0.05, **P < 0.001. 
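
The summary statistics named above (mean ± s.e.m. and a paired Student's t statistic) can be sketched in a few lines. This is only an illustration using the Python standard library, not the Origin Pro or Prism workflow the authors used; the function names and the example values are hypothetical, and no P value is computed here.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, s.e.m.), where s.e.m. = sample s.d. / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def paired_t(a, b):
    """Return the paired t statistic and degrees of freedom for matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_d, sem_d = mean_sem(diffs)
    return mean_d / sem_d, len(diffs) - 1


# Hypothetical latencies in ms (illustrative values, not data from the paper).
latencies = [4.0, 5.0, 6.0]
mean, sem = mean_sem(latencies)
print(f"latency = {mean:.2f} ± {sem:.2f} ms")

# Hypothetical within-subject comparison (treatment vs control, same animals).
t_stat, dof = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(f"t = {t_stat:.3f}, d.f. = {dof}")
```

A two-tailed P value would then be read from the t distribution with the reported degrees of freedom, which in practice is left to a statistics package.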


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS 


Animals. All animal care and experimental procedures were approved by the Beth 
Israel Deaconess Medical Center Institutional Animal Care and Use Committee. 
Mice were housed at 22-24 °C with a 12 h light:12 h dark cycle with standard 
mouse chow (Teklad F6 Rodent Diet 8664; 4.05 kcal g−1, 3.3 kcal g−1 metabolizable 
energy, 12.5% kcal from fat; Harlan Teklad) and water provided ad libitum. 
All diets were provided as pellets. Mice were euthanized by CO2 narcosis. 

Generation of mice. Pdyn-IRES-Cre, Avp-IRES-Cre, Crh-IRES-Cre, Trh-IRES-Cre 
and Pacap-IRES-Cre mice were generated using recombineering techniques as 
previously described’*”’. Briefly, a selection cassette containing an internal ribosomal 
entry sequence (IRES) linked to Cre recombinase and an Frt-flanked kanamycin 
resistance gene was targeted just downstream of the stop codon of the prodynor- 
phin, arginine vasopressin, corticotropin releasing hormone, thyrotropin releasing 
hormone or adenylate cyclase activating peptide 1 gene, respectively, in a bacterial 
artificial chromosome, so that Cre recombinase expression was driven by the 
endogenous genes. A targeting plasmid containing the Cre-containing selection 
cassette and 4 kb genomic sequence upstream and downstream of the prodynor- 
phin, arginine vasopressin, corticotropin releasing hormone, thyrotropin releasing 
hormone or adenylate cyclase activating peptide 1 stop codon, respectively, was 
isolated and used for embryonic stem cell targeting. Correctly targeted clones were 
identified by long range PCR and injected into blastocysts. Chimaeric animals 
generated from blastocyst implantation were then bred for germline transmission 
of the altered prodynorphin, arginine vasopressin, corticotropin releasing 
hormone, thyrotropin releasing hormone or adenylate cyclase activating peptide 1 
allele, respectively. Flp-deletion mice were then used to remove the neomycin 
selection cassette. 

Generation of enhanced Cre-dependent GFP reporter mice. R26-loxSTOPlox-L10-GFP 
mice were generated using recombineering techniques as previously described”. 
A transgene containing a lox-flanked transcriptional blocking cassette followed by 
EGFP fused to the L10 ribosomal subunit*' was placed under the control of a 
CMV-enhancer/chicken β-actin promoter and targeted to the Rosa26 locus using 
standard techniques”. Correctly targeted clones were identified by long range 
PCR, confirmed by Southern blotting and injected into blastocysts. 

Characterization of mice. Pdyn-IRES-Cre, Avp-IRES-Cre, Crh-IRES-Cre and 
Pacap-IRES-Cre mice were crossed to R26-loxSTOPlox-L10-GFP mice; the offspring 
were euthanized and their brains sectioned at 30 μm. Brain sections were washed 
in 0.1 M phosphate-buffered saline with Tween 20, pH 7.4 (PBST, 2 changes) and 
then incubated in the primary antiserum (anti-GFP, Abcam (1:1,000), rabbit 
anti-hrGFP). After several washes in PBS, sections were incubated in Alexa 
fluorophore secondary antibody (Molecular Probes, 1:200) for 2 h at room 
temperature. After several washes in PBS, sections were mounted onto 
gelatin-coated slides and fluorescent images were captured with an Olympus VS120 
slide scanner microscope. 

Because the Trh-IRES-Cre line shows transient, early embryonic expression and 
consequent deletion of floxed alleles, Trh-IRES-Cre mice were instead 
stereotaxically injected with Cre-dependent AAV8-DIO-mCherry’, then euthanized 
and sectioned at 30 μm. Native fluorescence was used and fluorescent images were 
captured with an Olympus VS120 slide scanner microscope. All digital images were 
processed in the same way between experimental conditions to avoid artificial 
manipulation between different data sets. 

In situ mRNA hybridization experiments were performed for Pdyn-IRES-Cre, 
Trh-IRES-Cre and Pacap-IRES-Cre mice (details below). Antibody staining 
experiments were performed for Avp-IRES-Cre and Crh-IRES-Cre mice. Briefly, GFP 
in Avp-IRES-Cre; R26-loxSTOPlox-L10-GFP mice was co-localized with AVP using 
rabbit anti-AVP (Sigma, 1:500) and chicken anti-GFP (Abcam, 1:1,000). After 48 h 
of intracerebroventricular colchicine treatment (Sigma, 10 μg), GFP in 
Crh-IRES-Cre; R26-loxSTOPlox-L10-GFP mice was co-localized with CRF using rabbit 
anti-CRF (Peninsula Laboratories, 1:2,500) and chicken anti-GFP (Abcam, 1:1,000). 

All images were subsequently compared to in situ mRNA expression profiles 
generated by the Allen Institute for Brain Science; Allen Mouse Brain Atlas 2013 
(http://mouse.brain-map.org/). 

Breeding strategies. Vglut2-IRES-Cre, Sim1-Cre, Pdyn-IRES-Cre, Oxt-IRES-Cre, 
Avp-IRES-Cre, Crh-IRES-Cre, Trh-IRES-Cre and Pacap-IRES-Cre mice were bred 
to Npy-hrGFP transgenic mice to generate heterozygous Vglut2-IRES-Cre; Npy-hrGFP, 
Sim1-Cre; Npy-hrGFP, Pdyn-IRES-Cre; Npy-hrGFP, Oxt-IRES-Cre; Npy-hrGFP, 
Avp-IRES-Cre; Npy-hrGFP, Crh-IRES-Cre; Npy-hrGFP, Trh-IRES-Cre; 
Npy-hrGFP and Pacap-IRES-Cre; Npy-hrGFP mice, respectively, as previously 
reported®. Vglut2-IRES-Cre and Pacap-IRES-Cre mice were bred to Pomc-hrGFP 
transgenic mice to generate heterozygous Vglut2-IRES-Cre; Pomc-hrGFP and 
Pacap-IRES-Cre; Pomc-hrGFP mice, respectively, as previously reported’. 

Agrp-IRES-Cre, Pdyn-IRES-Cre, Avp-IRES-Cre, Crh-IRES-Cre and Pacap-IRES-Cre 
mice were bred to R26-loxSTOPlox-L10-GFP mice to generate heterozygous 
Agrp-IRES-Cre; R26-loxSTOPlox-L10-GFP, Pdyn-IRES-Cre; R26-loxSTOPlox-L10-GFP, 
Avp-IRES-Cre; R26-loxSTOPlox-L10-GFP, Crh-IRES-Cre; R26-loxSTOPlox-L10-GFP 
and Pacap-IRES-Cre; R26-loxSTOPlox-L10-GFP mice, respectively. 
Agrp-IRES-Cre; R26-loxSTOPlox-L10-GFP mice were then bred to Sim1-Cre and 
Pacap-IRES-Cre to generate heterozygous Agrp-IRES-Cre; Sim1-Cre; R26-loxSTOPlox-L10-GFP 
and Agrp-IRES-Cre; Pacap-IRES-Cre; R26-loxSTOPlox-L10-GFP mice, respectively. 
All mice are on a mixed background. 

Viral injections. Stereotaxic injections were performed as previously described’. 
Mice were anaesthetized with xylazine (5 mg per kg) and ketamine (75 mg per kg) 
diluted in saline (350 mg per kg) and placed into a stereotaxic apparatus (KOPF 
Model 963). After exposing the skull via small incision, a small hole was drilled 
for injection. A pulled-glass pipette with 20-40 μm tip diameter was inserted into 
the brain and virus was injected by an air pressure system. A micromanipulator 
(Grass Technologies, Model S48 Stimulator) was used to control injection speed at 
25 nl min−1 and the pipette was withdrawn 5 min after injection. For retrograde 
rabies tracing, AAV-FLEX-TVA-mCherry, serotype 8 (titre 1.1 X 10'” genomes copies 
per ml) and AAV-FLEX-RG, serotype 8 (titre 1.4 X 10’? genomes copies per ml)! 
were injected unilaterally into the ARC (200 nl, coordinates, bregma: AP: −1.50 mm, 
DV: −5.80 mm, L: −0.20 mm) of 5-6-week-old mice. Then, 21 days later, SADΔG-EGFP 
(EnvA) rabies (titre 10’ genomes copies per ml) was unilaterally injected into 
the ARC (400 nl, coordinates, bregma: AP: −1.50 mm, DV: −5.80 mm, L: −0.20 mm). 
For electrophysiology experiments, AAV-DIO-ChR2(H134R)-mCherry, serotype 
8 (titre 1.3 X 10’? genomes copies per ml)”* was injected unilaterally into the DMH 
(20 nl, coordinates, bregma: AP: — 1.70 mm, DV: —5.00 mm, L: —0.25 mm), PVH 
(25 nl, coordinates, bregma: AP: —0.70 mm, DV: —4.75 mm, L: —0.20 mm), VMH 
(25 nl, coordinates, bregma: AP: —1.40mm, DV: —5.60 mm, L: —0.40 mm) or 
ARC (20 nl, coordinates, bregma: AP: -1.50 mm, DV: —5.80 mm, L: —0.20 mm) of 
5-6-week-old mice. For feeding and Fos studies, AAV-DIO-hM3Dq-mCherry, 
serotype 8 (titre 1.2 × 10'* genomes copies per ml)? or AAV-DIO-hM4Di-mCherry, 
serotype 8 (titre 1.7 × 10’? genomes copies per ml)? was injected bilaterally 
or unilaterally, respectively, into the PVH (25 nl, coordinates, bregma: AP: 
−0.70 mm, DV: −4.75 mm, L: −0.20 mm) of 5-6-week-old mice. For validation 
of Trh-IRES-Cre mice, AAV-DIO-mCherry, serotype 8 (titre 1.2 × 10” genomes 
copies per ml)* or AAV-DIO-GFP, serotype 8 (titre 1.4 × 10’? genomes copies 
per ml) was injected bilaterally into the PVH (25 nl, coordinates, bregma: AP: 
−0.70 mm, DV: −4.75 mm, L: −0.20 mm) of 5-6-week-old mice. All viruses 
were packaged at the Gene Therapy Center at the University of North Carolina. For 
postoperative care, mice were injected intraperitoneally with meloxicam (0.5 mg 
per kg). All stereotaxic injection sites were verified under electrophysiological 
microscopy (for electrophysiology-related studies) or by immunohistochemistry 
(for anatomy and in vivo studies). All ‘misses’ or ‘partial hits’ were excluded from 
data analyses. 

SADΔG-EGFP (EnvA) rabies cell counting. One week after SADΔG-EGFP 
(EnvA) rabies injection, mice were perfused and brains were sectioned and mounted 
as described above. For each brain (n = 6), ×10 magnification images were taken 
throughout one entire brain series using a Zeiss AxioImager Z.1 microscope and 
EGFP+ cells were quantified using these images. Each EGFP+ cell was assigned to 
a specific anatomical structure of the hypothalamus using The Mouse Brain in 
Stereotaxic Coordinates (Franklin & Paxinos). As the number of labelled cells 
varied between animals depending on the transduction rate of the viruses, 
the counted cells in each anatomical structure were expressed as a percentage of 
the total number of cells counted throughout each mouse brain. The percentages 
for each anatomical structure in a given mouse were then averaged over the entire 
cohort of mice (Supplementary Fig. 2). 
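
The normalization described above (per-mouse percentages first, cohort average second) can be sketched as follows. This is a minimal illustration only: the function names, region labels and cell counts are hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict


def region_percentages(counts):
    """Express each region's cell count as a percentage of that mouse's total."""
    total = sum(counts.values())
    return {region: 100.0 * n / total for region, n in counts.items()}


def cohort_average(per_mouse_counts):
    """Average the per-mouse percentages over all mice.

    A region missing from one mouse contributes 0% for that mouse.
    """
    sums = defaultdict(float)
    for counts in per_mouse_counts:
        for region, pct in region_percentages(counts).items():
            sums[region] += pct
    n_mice = len(per_mouse_counts)
    return {region: s / n_mice for region, s in sums.items()}


# Hypothetical counts for two mice with different transduction efficiency.
mice = [
    {"PVH": 30, "DMH": 60, "VMH": 10},    # 100 labelled cells in total
    {"PVH": 100, "DMH": 80, "VMH": 20},   # 200 labelled cells in total
]
average = cohort_average(mice)
for region in sorted(average):
    print(f"{region}: {average[region]:.1f}%")
```

Normalizing within each mouse before averaging keeps a heavily transduced animal from dominating the result, which pooling raw counts across animals would not do.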

Electrophysiology and circuit mapping. For brain slice preparation, 6-10-week-old 
mice were anaesthetized with isoflurane before decapitation and removal of the 
entire brain. Brains were immediately submerged in ice-cold, carbogen-saturated 
(95% O2, 5% CO2) high sucrose solution (238 mM sucrose, 26 mM NaHCO3, 
2.5 mM KCl, 1.0 mM NaH2PO4, 5.0 mM MgCl2, 10.0 mM CaCl2, 11 mM glucose). 
Then, 300-μm-thick coronal sections were cut with a Leica VT1000S Vibratome 
and incubated in oxygenated aCSF (126 mM NaCl, 21.4 mM NaHCO3, 2.5 mM 
KCl, 1.2 mM NaH2PO4, 1.2 mM MgCl2, 2.4 mM CaCl2, 10 mM glucose) at 34 °C 
for 30 min. Then, slices were maintained and recorded at room temperature 
(20-24 °C). The intracellular solution for voltage clamp recording contained the 
following (in mM): 140 CsCl, 1 BAPTA, 10 HEPES, 5 MgCl2, 5 Mg-ATP, 0.3 
Na2GTP, and 10 lidocaine N-ethyl bromide (QX-314), pH 7.35 and 290 mOsm. 
The intracellular solution for current clamp recordings contained the following 
(in mM): 128 K gluconate, 10 KCl, 10 HEPES, 1 EGTA, 1 MgCl2, 0.3 CaCl2, 5 
Na2ATP, 0.3 NaGTP, adjusted to pH 7.3 with KOH. 

Photostimulation-evoked EPSCs and IPSCs were recorded in the whole cell 
voltage clamp mode, with membrane potential clamped at −60 mV. 
Photostimulation-evoked EPSCs were recorded in the presence of picrotoxin 
(100 μM) to block inhibitory postsynaptic currents. All recordings were made 
using a MultiClamp 700B amplifier, and data were filtered at 2 kHz and digitized 
at 10 kHz. To photostimulate channelrhodopsin2-positive fibres, a laser or LED 
light source (473 nm; Opto Engine; Thorlabs) was used. The blue light was focused 
on to the back aperture of the microscope objective, producing a wide-field 
exposure around the recorded cell of 10-15 mW mm−2. The light power at the 
specimen was measured using an optical power meter PM100D (Thorlabs). The light 
output was controlled by a programmable pulse stimulator, Master-8 (A.M.P.I.) and 
the pClamp 10.2 software (Axon Instruments). The detection protocol for 
photostimulation-evoked EPSCs/IPSCs consisted of four blue-light laser pulses 
administered 1 s apart during the first 4 s of an 8 s sweep, repeated for a total 
of 30 sweeps. When recording photostimulation-evoked action potentials in AgRP 
neurons, current was injected into cells to keep the baseline membrane potential 
at approximately −60 mV. Mice with total misses, partial expression or expression 
outside the intended area were excluded from 
analysis after examination of mCherry expression. VGLUT2” MH_, AgRPARC n=4; 
VGLUT2°VUH_s AgRPARS n = 5; VGLUT2¥M"_s AgRPARS nn = 4; VGLUT2"4 
—>AgRP“®© n = 3; VGLUT24®—>AgRPA®© n = 3; VGLUT2°M4>POMCARS 
n= 3; SIM1PY4_>AgRPAR© n= 3; PDYN?PY# > AgRPARS n = 3; OXTPYE > 
AgRPAR© n = 3; AVPPYH_5 AgRPARC n = 3; CRHPY > AgRPARS nl = 3; TRH 
—>AgRPARS n = 3; PACAPPYH_ 5 AgRPARS n = 4; TRH’ SPOMCARS n = 4; 
PACAP’Y"sPOMC“F© n= 3; PACAPYMH_sPOMCARS n= 2; AgRPARC 5 
SIM1?Y" n = 2; AgRPA®S_>TRHPY" n = 3; AgRPARS_sPACAP PY" n = 3, 
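
The photostimulation protocol described above (four pulses 1 s apart at the start of each 8 s sweep, 30 sweeps) can be laid out as a list of pulse onset times. The sketch below is illustrative only; it assumes that the first pulse coincides with sweep onset and that sweeps run back to back, neither of which is stated in the text, and the function name is hypothetical.

```python
def pulse_onsets(n_sweeps=30, sweep_s=8.0, pulses_per_sweep=4, interval_s=1.0):
    """Return the onset time (s) of every light pulse across all sweeps.

    Assumes the first pulse falls at sweep onset and sweeps are contiguous.
    """
    return [
        sweep * sweep_s + pulse * interval_s
        for sweep in range(n_sweeps)
        for pulse in range(pulses_per_sweep)
    ]


onsets = pulse_onsets()
print(len(onsets))    # total number of pulses delivered
print(onsets[:5])     # the four pulses of sweep 1, then the first pulse of sweep 2
```

Under these assumptions each recorded cell receives 120 pulses in total, with 4 s of unstimulated recording at the end of every sweep.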

To assess the effect of PACAP1-38 (100 nM)” and PACAP6-38 (200 nM) on 
AgRP neurons, we performed whole cell current clamp recordings on 5-8-week-old 
Npy-hrGFP mice. Synaptic blockers (1 mM kynurenate and 100 μM picrotoxin) 
were added in aCSF to synaptically isolate AgRP neurons. 

To assess the effect of CNO on TRH and PACAP neurons, 5- to 7-week-old 
Pacap-IRES-Cre and Trh-IRES-Cre mice were injected with AAV8-DIO-hM3Dq-mCherry 
into the PVH 2-3 weeks before recording. CNO was applied to bath 
solution through perfusion as previously reported®. After acquisition of stable 
whole-cell recordings for 2-5 min, aCSF solution containing 5 μM CNO was 
perfused into the brain slice preparation. 

Fos analysis. Animals (Trh-IRES-Cre; Npy-hrGFP; n = 3 and Pacap-IRES-Cre; 
Npy-hrGFP; n = 3 mice) were handled for 10 consecutive days before the assay 
to reduce stress response, and then injected with CNO (0.3 mg per kg; i.p.) at 9:00. 
Then, 150 min later, the animals were euthanized with 7% chloral hydrate diluted 
in saline (350 mg per kg) for histological assay. The mice were perfused and brains 
were sectioned as described above. Assessment of Fos induction was performed 
using a previously developed method* modified for fluorescent co-localization 
with hrGFP in AgRP neurons. Brain sections were processed for 
immunohistochemical detection of Fos and hrGFP and counting. 

Brain sections were washed in 0.1 M phosphate-buffered saline with Tween 20, 
pH 7.4 (PBST, 2 changes) and then incubated in the primary antiserum (anti-GFP, 
Abcam (1:1,000), rabbit anti-hrGFP and anti-Fos, Calbiochem (1:10,000)). After 
several washes in PBS, sections were incubated with Alexa fluorophore secondary 
antibodies (Molecular Probes, 1:200) for 2 h at room temperature. After several 
washes in PBS, sections were mounted onto gelatin-coated slides and fluorescent 
×10 magnification images were taken initially using a Zeiss AxioImager Z.1 
microscope. Co-localization and quantification were further determined using 
fluorescent ×20 magnification images taken using a Zeiss AxioImager Z.1 
microscope. Data are expressed as the percentage of all AgRP neurons (that is, 
all hrGFP-positive neurons) that were double-positive for Fos and hrGFP. 

Food intake studies. Food intake studies on chow were performed as previously 
described**”°. All animals (10- to 12-week-old male mice) were singly housed for 
at least 2.5 weeks following surgery and handled for 10 consecutive days before the 
assay to reduce stress response. Feeding studies were performed in home cages 
with ad libitum food access. CNO was administered at 0.3 mg per kg of body 
weight. Saline was delivered at the same volume to maintain consistency in the 
studies. Previous publications suggest the duration of the drug's effect at the 
dosage used throughout the studies is approximately 8 h**. Mice with ‘missed’ 
injections, incomplete ‘hits’ or expression outside the area of interest were 
excluded from analysis after post hoc examination of mCherry expression. In this 
way, all food intake measurements were randomized and blind to the experimenter. 


For light cycle measurements, animals (Trh-IRES-Cre; n = 8, Pacap-IRES-Cre; 
n = 7 mice; Fig. 4c, f, or Trh-IRES-Cre; Agrp-IRES-Cre; n = 4, Trh-IRES-Cre; n = 4 
mice; Fig. 4h) were injected with either saline or CNO (0.3 mg per kg; i.p.) at 9:00 
and food intake was monitored 1 h and/or 2 h after i.p. injection from 9:00 to 11:00. 
A full trial consisted of assessing food intake from the study subjects after they 
received injections of saline on day 1 and CNO on day 2. Animals received a day 
‘off’ between trials before another trial was initiated. The food intake data from 
all days following saline/CNO (n = 24 trials for Trh-IRES-Cre; n = 21 trials for 
Pacap-IRES-Cre; Fig. 4c, f; n = 12 trials for Trh-IRES-Cre; Agrp-IRES-Cre; n = 12 
trials for Trh-IRES-Cre; Fig. 4h) injections were then averaged and combined for 
analysis. 
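
The trial counts reported above are consistent with three trials per animal (8 mice × 3 = 24, 7 × 3 = 21, 4 × 3 = 12), matching the Statistics statement that each animal's final value is an average of 3 trials. A minimal sketch of that within-subject averaging follows; the data layout, function name and intake values are hypothetical and are not the authors' records.

```python
from statistics import fmean

TRIALS_PER_ANIMAL = 3  # "an average of 3 trials" per animal, per the Statistics section


def average_trials(trials_by_animal):
    """Collapse each animal's repeated saline/CNO trials into one value per condition."""
    averaged = {}
    for animal, trials in trials_by_animal.items():
        averaged[animal] = {
            "saline": fmean([t["saline"] for t in trials]),
            "cno": fmean([t["cno"] for t in trials]),
        }
    return averaged


# Hypothetical 2 h food intake in grams (illustrative values only).
data = {
    "mouse_1": [
        {"saline": 0.1, "cno": 0.5},
        {"saline": 0.2, "cno": 0.6},
        {"saline": 0.3, "cno": 0.7},
    ],
    "mouse_2": [
        {"saline": 0.0, "cno": 0.4},
        {"saline": 0.1, "cno": 0.4},
        {"saline": 0.2, "cno": 0.4},
    ],
}
for animal, values in average_trials(data).items():
    print(animal, round(values["saline"], 2), round(values["cno"], 2))
```

Each animal then contributes one saline value and one CNO value to the repeated-measures comparison, so the animal, not the trial, is the unit of analysis.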

For dark cycle measurements, animals (Trh-IRES-Cre; n = 4 mice; Fig. 4j) were 
injected with either saline or CNO (0.3 mg per kg; i.p.) at 18:00 and food intake was 
monitored 1, 2 and 3 h after i.p. injection from 18:00 to 21:00. A full trial consisted 
of assessing food intake from the study subjects after they received injections of 
saline on day 1 and CNO on day 2. Animals received a day ‘off’ between trials 
before another trial was initiated. The food intake data from all days following 
saline/CNO (n = 12 trials for Trh-IRES-Cre; Fig. 4j) injections were then averaged 
and combined for analysis. 

For fast-refeed measurements, animals (Trh-IRES-Cre; n = 8, Pacap-IRES-Cre; 
n = 7 mice; Supplementary Fig. 12) were fasted overnight from 17:00 and the 
following day were injected with either saline or CNO (0.3 mg per kg; i.p.) at 8:30 and 
food intake was monitored 1.5, 2.5 and 4.5 h after i.p. injection from 9:00 to 13:00. 
A full trial consisted of assessing food intake from the study subjects after they 
received injections of saline on week 1 and CNO on week 2. Animals received a 
week ‘off’ between trials before another trial was initiated. The food intake data 
from all days following saline/CNO (n = 16 trials for Trh-IRES-Cre; n = 14 trials 
for Pacap-IRES-Cre; Supplementary Fig. 12) injections were then averaged and 
combined for analysis. 

In situ hybridization. Pacap-IRES-Cre; R26-loxSTOPlox-L10-GFP and 
Pdyn-IRES-Cre; R26-loxSTOPlox-L10-GFP mice were euthanized and brains were 
frozen fresh. Trh-IRES-Cre mice with bilateral AAV-DIO-GFP injections into the 
PVH were perfused with 4% PFA, post-fixed for 2 h and equilibrated in 20% sucrose 
overnight. Then 14-μm cryosections were analysed by in situ hybridization as 
previously described* with minor modifications. Slides were fixed (4% PFA), washed 
(2× RNase-free PBS), coverslips were added and samples imaged for intrinsic 
Cre-mediated GFP fluorescence. Digoxigenin-labelled anti-sense cRNA probes 
were generated by T3 (Roche) in vitro transcription reactions using the complete 
coding sequence of Pdyn (5' primer: 5'-ATGGCGTGGTCCAGGCTGATGC-3'; 
3' primer: 5'-TCAAACATCTAAATCTTCAGAATAGG-3') and a 914-bp 
fragment of Pacap cDNA (5' primer: 5'-CTGCGTGACGCTTACGCCCT-3'; 3' primer: 
5'-TTGCCCCTGCAACCAGTGGG-3'). A previously described murine cDNA” 
(gift of Masanobu Yamada) was used to generate a Trh probe with SP6 RNA 
polymerase (Roche). Following hybridization, slides were incubated with 
anti-digoxigenin antibody conjugated to alkaline phosphatase (Roche, 1:200, 1 h room 
temperature), washed and incubated (3-6 h) in NBT/BCIP chromogenic 
substrate according to the manufacturer’s specifications (Roche). Slides had 
coverslips added and brightfield images were captured. Images were pseudocoloured 
and compared with intrinsic Cre-mediated GFP fluorescence for co-localization 
and quantification. 
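
The primer sequences above can be given a quick sanity check: a primer pair that amplifies a complete coding sequence should begin at a start codon and end at a stop codon. The sketch below is an illustration only, not part of the authors' procedure; the constant and function names are hypothetical, while the sequences are copied from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Primer sequences as given in the Methods (5' to 3').
PDYN_FWD = "ATGGCGTGGTCCAGGCTGATGC"
PDYN_REV = "TCAAACATCTAAATCTTCAGAATAGG"
PACAP_FWD = "CTGCGTGACGCTTACGCCCT"
PACAP_REV = "TTGCCCCTGCAACCAGTGGG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def spans_full_cds(fwd, rev):
    """True if the amplicon starts with ATG and ends with a stop codon.

    The reverse primer anneals to the sense strand, so its reverse complement
    gives the last bases of the amplicon in sense orientation.
    """
    return fwd.startswith("ATG") and reverse_complement(rev)[-3:] in STOP_CODONS


print(spans_full_cds(PDYN_FWD, PDYN_REV))    # Pdyn pair: complete coding sequence
print(spans_full_cds(PACAP_FWD, PACAP_REV))  # Pacap pair: internal cDNA fragment
```

The Pdyn pair passes this check and the Pacap pair does not, in line with the text's description of a complete Pdyn coding sequence and a 914-bp Pacap fragment.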

Statistical analysis. Statistical analyses were performed using Origin Pro 8.6 and 
Prism 6.0 (GraphPad) software. 

The transcription factors c-Myc and N-Myc (encoded by Myc and Mycn, 
respectively) regulate cellular growth’ and are required for 
embryonic development””’. A third paralogue, Mycl1, is dispensable 
for normal embryonic development but its biological function has 
remained unclear*. To examine the in vivo function of Mycl1 in 
mice, we generated an inactivating Mycl1gfp allele that also reports 
Mycl1 expression. We find that Mycl1 is selectively expressed in dendritic 
cells (DCs) of the immune system and controlled by IRF8, and 
that during DC development, Mycl1 expression is initiated in the 
common DC progenitor’ concurrent with reduction in c-Myc expression. 
Mature DCs lack expression of c-Myc and N-Myc but maintain 
L-Myc expression even in the presence of inflammatory signals such 
as granulocyte-macrophage colony-stimulating factor. All DC subsets 
develop in Mycl1-deficient mice, but some subsets such as migratory 
CD103+ conventional DCs in the lung and liver are greatly 
reduced at steady state. Importantly, loss of L-Myc by DCs causes a 
significant decrease in in vivo T-cell priming during infection by 
Listeria monocytogenes and vesicular stomatitis virus. The replacement 
of c-Myc by L-Myc in immature DCs may provide for Myc 
transcriptional activity in the setting of inflammation that is required 
for optimal T-cell priming®. 

c-Myc regulates cellular proliferation, metabolism and maintenance 
of progenitor populations”* and globally amplifies transcription by 

Figure 1 | DCs selectively express Mycl1 but not Myc or Mycn. a, Shown are 
mean values for Myc, Mycn and Mycl1 expression in progenitors and DC 
subsets as described in the Methods (error bars, s.d., n = 2-4 biological 
replicates). CLP, common lymphoid progenitor; GMP, granulocyte-macrophage 
progenitor; LT-HSC, long-term haematopoietic stem cell; MPP, 
multipotent progenitor. b, c-Kit and GFP expression by Flt3+ Lin− CD16/ 


direct interactions with RNA polymerases I, II and III, accounting for 
its regulation of disparate and context-dependent target loci across cell 
types””". Forced expression of L-Myc exerts weaker effects than c-Myc 
for cell growth, apoptosis and transformation’ but is more efficient in 
reprogramming fibroblasts towards induced pluripotent stem cells”. 
However, the in vivo function of L-Myc has not been established*. 

Mature DCs exhibit substantial proliferative activity'’*’*. In a 4-h 
in vivo 5-bromo-2’ deoxyuridine (BrdU) pulse labelling, B cells, mono- 
cytes and neutrophils showed a low rate of ~1% BrdU uptake (Extended 
Data Fig. 1a), whereas splenic conventional DC (cDC) subsets showed 
4-8% BrdU uptake, consistent with previous studies'*”*. In agreement, 
4-7% of cDCs were found to be in S/G2/M phase by DAPI (4’,6- 
diamidino-2-phenylindole) staining and 17-34% were in active cell 
cycle by Ki-67 staining (Extended Data Fig. 1b). Although splenic 
plasmacytoid DCs (pDCs) had little proliferative capacity, a significant 
fraction of pDCs in bone marrow were in S/G2/M phase by DAPI 
staining (Extended Data Fig. 1c). 

Myc and Mycn were highly expressed in various haematopoietic pro- 
genitor populations but were significantly reduced in mature DCs 
(Fig. 1a). By contrast, Mycl1 was expressed by common DC precursors 
(CDPs), committed precursors to cDCs° (pre-cDCs) and by mature 
splenic DCs (Fig. 1a), but not by other haematopoietic lineages* (Extended 
Data Fig. 1d, e), indicating that L-Myc expression replaces c-Myc and 

CD32− bone marrow cells from wild-type (WT), Mycgfp/gfp and Mycl1+/gfp 
mice. c, c-Myc-GFP and L-Myc-GFP expression in the indicated progenitors 
and mice. Numbers indicate mean fluorescence intensities. d-f, GFP 
expression for mice in c for the indicated cell populations. Data are 
representative of at least 4 experiments (n = 10 mice). Mono, monocyte; 
PMN, neutrophil; RPM, red pulp macrophage. 

N-Myc during DC development. We compared c-Myc and L-Myc 
expression at a single-cell resolution using Mycgfp reporter mice'® encoding 
a green fluorescent protein (GFP)-c-Myc amino-terminal fusion 
protein and a novel L-Myc allele, Mycl1gfp, that substitutes gfp for the 
first coding exon (Extended Data Fig. 1f and Extended Data Fig. 2). 
c-Myc protein was highly expressed in Flt3+ common myeloid progenitors 
(CMPs), was greatly reduced in CDPs and pre-cDCs (Fig. 1b, c) 

Figure 2 | Mycl1 is restricted to DCs in lymphoid and peripheral tissues 
and regulated by IRF8 and GM-CSF. a, CD11c and major histocompatibility 
complex class II (MHC-II) expression is shown for GFP+ cells from spleen, 
inguinal lymph node (ILN) and mesenteric lymph node (MLN) from Mycl1+/gfp 
mice (n = 5 mice). b, CD11b and GFP expression in CD45+ CD11c+ MHC-II+ 
cells from indicated tissues of wild-type (WT) and Mycl1+/gfp mice 
(n = 5 mice). c, Immunofluorescence for indicated markers from Mycl1+/gfp 
mice. Scale bars, 200 μm. d, Western blot analysis for L-Myc and phospholipase 
C (PLC)-γ2 from Flt3L-derived DCs from WT, Mycl1gfp/gfp and Csf2rb−/− mice, 
treated as indicated for 24 h. HKLM, heat-killed L. monocytogenes strain EGD. 
Data in a-d are representative of at least 3 experiments. e, IRF8 binding in the 
Mycl1 locus determined by ChIP-seq in wild-type or Batf3−/− DCs. Numbers 
represent normalized reads. 


and was undetectable in mature splenic DCs (Fig. 1d, e). By contrast, 
Mycl1gfp was absent in CMPs, became detectable in CDPs and pre-cDCs 
(Fig. 1b, c), and was highly expressed in mature splenic CD8α+ cDCs, 
CD8α− cDCs, and pDCs, but not in neutrophils, monocytes, red pulp 
macrophages, natural killer (NK) cells and T and B cells (Fig. 1d-f), 
consistent with Mycl1 messenger RNA expression (Extended Data Fig. 1d). 

DC subsets that developed in Mycl1gfp/gfp (L-Myc-deficient) mice 
showed no compensatory induction of Myc expression (Extended Data 
Fig. 1g, h). Mycl1gfp expression was observed in DCs that developed 
from Flt3 ligand (Flt3L)-treated bone marrow cultures in vitro (Extended 
Data Fig. 3a, b). Retroviral overexpression of c-Myc, but not L-Myc, 
in Flt3+ CMPs reduced the proportion of mature cDCs and pDCs 
that developed in Flt3L cultures (Extended Data Fig. 3c, d), suggesting 
that L-Myc may be non-redundant with c-Myc for DC development. 

We also compared Mycgfp and Mycl1gfp expression in other tissues. 
In inguinal and mesenteric lymph nodes, Mycl1gfp, but not Mycgfp, was 
expressed by pDCs and by migratory and resident cDCs (Fig. 2a and 
Extended Data Fig. 3e, f). In the lung, liver and dermis, Mycl1gfp was 
expressed predominantly by CD11b− cDCs, but, in the small intestine, 
Mycl1gfp was expressed by CD11b+ and CD11b− cDCs (Fig. 2b). Mycl1gfp 
was more highly expressed in resident CD8+ and the migratory 
CD103+ cDCs than in CD11b+ cDC subsets (Extended Data Fig. 3f) 
and was absent in macrophages in the peritoneum, kidney and liver 
(Extended Data Fig. 3g). Mycl1gfp-expressing DCs were abundant in 
the T-cell zones of spleen and lymph nodes, and less frequent within 
B-cell follicles and the splenic red pulp (Fig. 2c). Sparse Mycl1gfp-expressing 
DCs were present in the sub-capsular sinus of inguinal lymph 
nodes where Zbtb46+ cDCs reside!” (Fig. 2c). In addition, Mycl1gfp was 
expressed by CD4− B220− cells in small intestinal lamina propria, 
inside villi and within Peyer’s patches (Fig. 2c). Mycl1gfp expression was 
not detected in vascular endothelium, unlike Zbtb46gfp (Extended Data 
Fig. 4a, b). Thus, Mycl1gfp expression identifies DCs in both lymphoid 
and non-lymphoid peripheral tissues. 

Mycl1 expression by CMPs has been reported to be IRF8-dependent'®. 
In mice homozygous for the Irf8R294C point mutation” that interrupts 
interactions between IRF8 and the transcription factor PU.1, Mycl1gfp 
expression was absent in DC progenitors and substantially reduced in 
pDCs (Extended Data Fig. 5a, b). Moreover, Mycl1gfp was expressed in 
DCs differentiated from wild-type, but not Irf8R294C, Ly6Chi monocytes 
using interleukin (IL)-4 and granulocyte-macrophage 
colony-stimulating factor (GM-CSF)’’” (Extended Data Fig. 5a, c). L-Myc 
protein was maintained in cDCs under various inflammatory conditions 
including treatment with interferon (IFN)-γ and increased by treatment 
with GM-CSF (Fig. 2d and Extended Data Fig. 5d). Chromatin 
immunoprecipitation combined with massively parallel sequencing 
(ChIP-seq) analysis identified several IRF8 binding regions across Mycl1 that 
did not require the transcription factor BATF3 (ref. 21) (Fig. 2e and 
Extended Data Fig. 5e-g). Together, these results support a role for 
IRF8-PU.1 interactions in Mycl1 expression. 

In lymphoid and peripheral tissues, absence of L-Myc decreased the 
total number and relative frequency of DCs, which competitive mixed 
bone marrow chimaeras suggested was due to a cell-intrinsic defect 
(Fig. 3a-c and Extended Data Fig. 6a-e). The largest reduction was to 

Figure 3 | Mycl1 regulates DC proliferation and survival. a, Normalized 
donor contribution for indicated DC subsets in mixed bone marrow chimaeras 
(error bars, ± s.d., n = 9 mice, Mann-Whitney U-test). b, CD103+ CD11b− 
DCs from tissues from the indicated mice as percentage of CD45.2+ cells 
(error bars, ± s.d., n = 5-7 mice, two-way ANOVA Holm-Sidak post-hoc 
test). c, Donor contribution of CD103+ CD11b− DCs as in a shown for 
indicated tissues. d, 1 h BrdU incorporation for indicated cells and mice 
(error bars, ± s.d., n = 5, Student’s t-test). e, BrdU incorporation of CD24+ 
cDCs expressing ERTM or L-Myc-ERTM fusion proteins in response to tamoxifen 
(4-OHT) treatment (error bars, ± s.d., n = 4, two-way ANOVA Holm-Sidak 
post-hoc test). f, 7-Aminoactinomycin D (7-AAD)/Annexin V staining of 
splenic CD8α+ DCs from wild-type or Mycl1gfp/gfp (KO) mice treated with 
GM-CSF. g, Viable splenic CD8α+ DCs from f for wild-type or Mycl1gfp/gfp 
mice (error bars, ± s.d., n = 4, Student’s t-test). Data in e-g are representative of 
2-3 experiments. h, Volcano plot of wild-type and Mycl1gfp/gfp CD8α+ DCs 
treated with or without GM-CSF. Shown are genes increased >2-fold (red) or 
decreased >2-fold (blue) in wild-type relative to Mycl1gfp/gfp mice. The top 500 
genes induced in wild-type cells following GM-CSF treatment are shown 
(green) (n = 3 biological replicates, Welch’s t-test). *P < 0.05, ***P < 0.001, 
NS, P > 0.05. 


the CD103+ CD11b− cDCs in the lung, an organ rich in GM-CSF”. 
Gene set enrichment analysis (GSEA) also revealed significant enrichment 
of cell-cycle-related transcripts in lung CD103+ CD11b− cDCs 
as compared to migratory CD103+ CD11b− cDCs in draining lymph 
nodes (Extended Data Fig. 6f, g). Thus, variations in the abundance of 
cell-extrinsic factors such as GM-CSF in the local tissue microenvironment 
result in different homeostatic requirements for DCs that are 
revealed by the loss of L-Myc. 

Analysis of gene expression microarrays from wild-type and L-Myc-deficient 
pDCs, CD8α+ cDCs, and CD8α− cDCs identified a few presumed 
L-Myc target genes associated with cellular proliferation and 
apoptosis (Extended Data Fig. 7a, b). In agreement, splenic CD8α+ 
cDCs from L-Myc-deficient mice showed a 50% reduction in BrdU 
incorporation in vivo relative to wild-type mice (Fig. 3d). Also, a 
tamoxifen-activated L-Myc fusion protein (L-Myc-ERTM, in which ERTM 
represents the mutant human oestrogen receptor) markedly and 
specifically increased proliferation of CD8α+ cDCs (Fig. 3e), suggesting 
that L-Myc can regulate DC proliferation. GM-CSF, a cytokine known 
to regulate DC homeostasis, increased CD8α+ cDC cell size and expression 
of >500 genes, including many involved in regulation of apoptosis 
(Extended Data Fig. 7c-f). Notably, the ability of GM-CSF to increase 
CD8α+ cDC survival was impaired in the absence of L-Myc (Fig. 3f, g). 
Genes that appear to be targets of L-Myc in CD8α+ cDCs include 
eukaryotic translation initiation factor 1 (Eif1) and NADH dehydrogenase 
(ubiquinone) Fe-S protein 5 (Ndufs5), which could affect global 
protein translation and energy metabolism. Furthermore, of the 500 
genes induced by GM-CSF treatment in wild-type DCs (Extended Data 
Fig. 7e), 442 are reduced in expression in L-Myc-deficient CD8α+ 
cDCs (Fig. 3h), suggesting that the absence of L-Myc broadly limits 
inducible genes expressed in activated CD8α+ cDCs. 

Finally, we assessed whether L-Myc expression was required for
T-cell priming and other functions attributed to DCs. We measured
antigen-specific CD8+ and CD4+ T-cell responses after infection with
L. monocytogenes expressing soluble ovalbumin (LM-OVA). Loss of
L-Myc significantly decreased the total number of IFN-γ-producing
OVA-specific CD8+ and CD4+ T cells (Fig. 4a, b and Extended Data
Fig. 8a, b). To demonstrate that these effects were not the result of a
requirement for L-Myc in T cells, we adoptively transferred congenically
marked L-Myc-sufficient OVA-specific OT-I CD8+ T cells into
wild-type and L-Myc-deficient mice. After infection with LM-OVA,
OT-I CD8+ T-cell expansion was considerably reduced in L-Myc-deficient
mice as compared to wild-type mice (Fig. 4c, d). L-Myc-deficient
mice also showed impaired CD8+ T-cell priming after infection with
vesicular stomatitis virus expressing ovalbumin (VSV-OVA) (Extended
Data Fig. 8c, d). These priming defects were attributable to the action of
L-Myc in CD8α+ cDCs, as depletion of pDCs or Notch2-dependent
CD11b+ cDCs had no effect on CD8+ T-cell priming after infection
with LM-OVA (Extended Data Fig. 9a-c). Further, the defect did not
appear to involve processing and presentation of soluble antigen (Extended
Data Fig. 10a, b). Mixed chimaera analysis using Zbtb46%”"“" mice?” 
indicated that the priming defect is intrinsic to cDCs (Extended Data 
Fig. 9d-g). 

We recently showed that CD8α+ cDCs are required for L. monocytogenes
to establish infection in mice via the intravenous route, as Batf3−/−
mice lacking these cells are entirely resistant to lethal infection. Further,
for the first 24 h after infection, bacteria grow entirely within CD8α+
cDCs, which are the initial reservoir for bacterial expansion. Because
L-Myc is most highly expressed in the CD8α+ cDCs in spleen, we asked
whether L-Myc deficiency might influence infection by L. monocytogenes.
L-Myc-deficient mice were remarkably resistant to lethal infection
by L. monocytogenes relative to wild-type mice (Fig. 4e). This
resistance was caused by significant reduction in the intracellular growth
of bacteria within L-Myc-deficient CD8α+ cDCs, and was not due to
reduced bacterial capture or DC viability during the first 24 h of infection
(Fig. 4f and Extended Data Fig. 8e-h). This reduced growth of
L. monocytogenes was cell-intrinsic to CD8α+ cDCs (Fig. 4g and Extended
Data Fig. 10c, d) and was sufficient to prevent the subsequent spread of
bacteria to other lineages (Fig. 4h and Extended Data Fig. 10e).

The functional relationship of L-Myc to other Myc factors has remained
uncertain. We show that L-Myc is selectively expressed in DCs, maintained
during inflammation, and required by cDCs for optimal priming of
T cells in bacterial and viral infection. As c-Myc is repressed by interferons
and inducible genes can depend on c-Myc, L-Myc may provide
a means to support transcriptional responses required during T-cell
priming by cDCs.


METHODS SUMMARY 

Mice. Wild-type 129S6/SvEv mice were from Taconic. Wild-type C57BL/6 mice,
Csf2rb−/− mice and the congenic strain B6.SJL-Ptprc^a Pepc^b/BoyJ (B6.SJL) were
from The Jackson Laboratory. Mice were maintained in our specific pathogen-free
animal facility according to institutional guidelines. Generation of Myc®?/8?, 
Zbtb46""" and Irf8**© mice were described'*"”**. Inf?“ mice were backcrossed 
to C57BL/6 for 11 generations. Experiments used sex- and age-matched mice at 
6-16 weeks of age. All pathogen infections were performed on mice of the 129S6/ 
SvEv genetic background unless indicated. 
Figure 4 | Mycl1 supports normal T-cell priming by DCs following
infection but mediates resistance to lethal L. monocytogenes challenge.
a, CD8+ T cells from wild-type (WT) and Mycl1^gfp/gfp (KO) mice infected with
LM-OVA treated with SIINFEKL were analysed for TNF-α and IFN-γ
production. b, Total IFN-γ+ CD8+ T cells were measured from the indicated
mice after infection as described in a (error bars, ± s.d., n = 12, Student's t-test).
Data in a, b are from 2 independent experiments. c, CD45.1+ OT-I T cells
transferred into the indicated mice and infected with LM-OVA were measured
after 7 days. Numbers are OT-I T cells as a percentage of all splenocytes. d, Total
OT-I CD8+ T cells were measured from the indicated recipient mice after
infection as described in c (error bars, ± s.d., n = 8, Student's t-test). Data in
c, d are from 2 independent experiments. e, Survival of WT and KO mice after
infection with L. monocytogenes (n = 15, log-rank Mantel-Cox test). Data
are from 3 independent experiments. f, L. monocytogenes was measured in
purified CD8α+ DCs after 2 h or 24 h of infection from e (error bars, ± s.d.,
n = 4-6 biological replicates, Student's t-test). g, Splenic CD8α+ DCs from
mice infected for 2 h were cultured in vitro in media with the indicated
antibiotic for 12 h and viable intracellular bacteria quantified (error bars, ± s.d.,
n = 4, Student's t-test). Data in f, g are from 2 independent experiments.
h, Viable intracellular bacteria were measured as in g from the indicated cells
from WT and KO mice infected with L. monocytogenes for 60 h (error bars,
± s.d., n = 3, Student's t-test). **P < 0.001, NS, P > 0.05.


Myc, Mycn and Mycl1 expression in Fig. 1a was determined by microarrays
for the long-term haematopoietic stem cell, Flt3+ multipotent progenitor, CMP,
granulocyte-macrophage progenitor, common lymphoid progenitor, CDP, bone
marrow pre-cDC, splenic pDC, splenic CD8α+ DC and splenic CD8α− DC. In
Fig. 1b, lineage markers included Ter119, NK1.1, B220, MHC-II, CD3 and CD11b.
For Fig. 1c, CMPs were Lin’ CD16/CD32 — Flt3* cKit*, CDPs were Lin” CD16/ 
CD32~ Fit3* cKit'"”~ CD115* and pre-cDCs were Lin” CD16/CD32~ Fit3* cKit™ 
CD11c+. For Fig. 1d-f, gating is as follows: DCs, CD11c+ MHC-II+; neutrophils,
Ly6G+ CD11b+; monocytes, Ly6C+ CD11b+ Ly6G−; red pulp macrophages,
autofluorescent F4/80+; NK cells, NK1.1+ CD3ε−; CD8 T cells, CD3ε+ CD8α+ CD4−;
CD4 T cells, CD3ε+ CD4+ CD8α−; B cells, CD19+ B220+ SiglecH−. In Fig. 2a,
pDCs were gated as CD11c"*MHC-II'"~ cells; resident DCs as CD11¢* MHC- 
If" cells; and migratory DCs as CD11c'"*MHC-II" cells. In Fig. 2d, bone marrow 
cells were cultured for 9 days with Flt3L and treated with media, IL-4, IFN-γ,
GM-CSF or heat-killed L. monocytogenes strain EGD for 24 h.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
A transcriptional switch underlies commitment to
sexual development in malaria parasites 
The life cycles of many parasites involve transitions between disparate
host species, requiring these parasites to go through multiple
developmental stages adapted to each of these specialized niches.
Transmission of malaria parasites (Plasmodium spp.) from humans
to the mosquito vector requires differentiation from asexual stages
replicating within red blood cells into non-dividing male and female
gametocytes. Although gametocytes were first described in 1880,
our understanding of the molecular mechanisms involved in commitment
to gametocyte formation is extremely limited, and disrupting
this critical developmental transition remains a long-standing
goal¹. Here we show that expression levels of the DNA-binding protein
PfAP2-G correlate strongly with levels of gametocyte formation.
Using independent forward and reverse genetics approaches, we
demonstrate that PfAP2-G function is essential for parasite sexual
differentiation. By combining genome-wide PfAP2-G cognate motif
occurrence with global transcriptional changes resulting from PfAP2-G
ablation, we identify early gametocyte genes as probable targets
of PfAP2-G and show that their regulation by PfAP2-G is critical
for their wild-type level expression. In the asexual blood-stage parasites
pfap2-g appears to be among a set of epigenetically silenced
loci²,³ prone to spontaneous activation⁴. Stochastic activation presents
a simple mechanism for a low baseline of gametocyte production.
Overall, these findings identify PfAP2-G as a master regulator
of sexual-stage development in malaria parasites and mark the first
discovery of a transcriptional switch controlling a differentiation
decision in protozoan parasites.

From its uptake in a mosquito blood meal to initial infection of red 
blood cells in the subsequent host, the malaria parasite Plasmodium 
falciparum goes through at least seven key developmental changes (asexual
red cell stage → gametocyte → gamete → ookinete → oocyst →
sporozoite → liver stage → asexual red cell stage). In all but one case,
as the parasite reaches its subsequent niche within the host, differenti- 
ation into the appropriate developmental stage is a necessity for con- 
tinuation of the life cycle. The lone exception occurs once the parasite 
has started replicating in red blood cells. During the 48-h intraerythro- 
cytic developmental cycle following each new red blood cell invasion, 
a developmental decision is made that determines whether daughter 
parasites will continue replicating asexually and maintain the infection 
of the current host or differentiate into non-dividing male or female 
gametocytes. Although the latter decision is a dead-end for replication
within the current host, it is essential for infection of mosquitoes and
thus transmission to the next host⁵,⁶.

A recent study on transcriptional variation identified differentially 
expressed genes linked to early gametocyte development in two stocks 
(3D7-A and 3D7-B) of the common 3D7 P. falciparum parasite line⁷.
Within this expression cluster of early gametocyte markers, we noted the
presence of a potential transcriptional regulator, PfAP2-G (PFL1085w/
PF3D7_1222600; http://www.plasmodb.org), which belongs to the apicomplexan
AP2 (ApiAP2) family of DNA-binding proteins (Supplementary
Fig. 1) and is conserved among most members of the phylum
(Supplementary Fig. 2). ApiAP2 proteins represent the main family of
transcriptional regulators in malaria parasites⁸ and have thus far been
found to regulate several of the parasite’s developmental transitions,
including ookinete formation⁹ and oocyst sporozoite maturation¹⁰ within
the mosquito, and development in the mammalian liver¹¹. Follow-up
quantitative PCR with reverse transcription (qRT-PCR) analysis in
blood-stage parasites confirmed higher pfap2-g transcript abundance
in 3D7-B compared to 3D7-A and also revealed significant variation in
expression levels between individual 3D7-B subclones (Fig. 1a). Notably,
when gametocyte formation was measured in these lines, pfap2-g transcript
levels were highly predictive (R² > 0.99) of relative gametocyte
production (Fig. 1b). 
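
The R² statistic quoted above can be illustrated with a short calculation. The following is a minimal sketch, assuming an ordinary least-squares line relating transcript abundance to per cent commitment; the function name `r_squared` and the numerical values are hypothetical placeholders and are not the measurements behind Fig. 1.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual and total sums of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (NOT data from the paper):
# relative transcript abundance and per cent gametocyte commitment per line.
transcript = [0.1, 0.4, 1.0, 2.1, 3.0]
commitment = [0.5, 2.1, 4.9, 10.4, 15.2]

print(round(r_squared(transcript, commitment), 4))
```

An R² close to 1 indicates that nearly all of the variation in commitment between lines is accounted for by a linear dependence on transcript level; it does not by itself establish causation, which is why the genetic experiments that follow are needed.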
Figure 1 | pfap2-g transcript levels mirror gametocyte production. a, pfap2-g
relative transcript abundance in synchronized (early schizont stage) cultures 
as measured by qPCR varies significantly between 3D7-A and 3D7-B 
populations as well as the 3D7-B subclones E5, A7 and B11. Values are 
normalized against seryl transfer RNA synthetase (PF07_0073) (n = 3, standard 
deviation shown). b, Per cent commitment to gametocyte differentiation in 
these lines mirrors relative pfap2-g transcript levels (mean of n = 2). 
In a parallel line of inquiry, we screened the well-studied gametocyte
non-producer line F12 (refs 5, 6, 12), as well as a second parasite line
(GNP-A4) that had also spontaneously lost its ability to produce gametocytes,
for mutations in protein-coding regions. Whole genome sequencing
of these lines revealed that the only gene containing mutations in
both F12 and GNP-A4 was pfap2-g (Supplementary Table 1), resulting
in the introduction of stop codons upstream of or within the AP2 DNA-binding
domain (Fig. 2a and Supplementary Fig. 3). Previous studies
identified subtelomeric deletions in the right arm of P. falciparum chromosome 9
that are associated with defective gametocyte production¹³,¹⁴.
The F12 and GNP-A4 clones do not have coding-sequence mutations
or deletions within the chromosome 9 region, nor within any of the 16
genes recently implicated in gametocyte development by random transposon
mutagenesis¹⁵. The presence of pfap2-g mutations in two independently
derived gametocyte non-producer lines provides a second,
independent connection between PfAP2-G and gametocyte formation,
pointing to this locus as a key determinant of sexual differentiation.
Although spontaneous inactivation of pfap2-g has occurred repeatedly
in vitro, no loss-of-function mutations could be found in the genomes
of nearly 300 distinct field isolates¹⁶, further underlining its potential
importance to transmission.

To directly test the contribution of PfAP2-G function to gametocyte
formation, we generated a PfAP2-G null mutant (Δpfap2-g) via double
homologous recombination in the high-gametocyte-producing 3D7-B
subclone E5 (Fig. 2a, b and Supplementary Fig. 4). As predicted based
on our earlier sequencing results, the Δpfap2-g mutant completely lost
the ability to produce gametocytes (Fig. 2c). To identify any additional
mutations that may have been acquired in the extended process of generating
the pfap2-g knockout, we sequenced the genomes of both Δpfap2-g
and its E5 parent. Apart from the targeted deletion, we found only a
limited number of additional mutations within coding regions, none
of which are shared with the other non-producer lines that we sequenced
or found in genes previously linked to gametocyte development¹⁴,¹⁵
(Supplementary Table 1). This combination of forward and reverse
genetic evidence strongly implies the essentiality of PfAP2-G for the
production of gametocytes in P. falciparum. In direct competition
cultures, Δpfap2-g consistently outgrew its parent E5, consistent with
the fact that PfAP2-G action occurs at or before the asexual/sexual decision
but not thereafter, as only a failure to initiate gametocytogenesis
would provide an in vitro growth advantage (Supplementary Fig. 5).

Attempts at generating full-length complementation expression con- 
structs were unsuccessful, probably owing to the considerable length 
(7.3 kilobases (kb)) of the coding sequence and its very low complexity 
(21.8% GC and long repeat sequences). As an alternative confirmation 
for the role of PfAP2-G in gametocyte formation, we made PfAP2-G
function ligand-regulatable by appending the FKBP-derived desta- 
bilization domain (ddFKBP) to the 3’ end of the endogenous coding 
sequence (pfap2-g-ddfkbp, Supplementary Fig. 6a, b). In the absence of 
the synthetic ligand Shield-1 (Shld1) the ddFKBP domain is unstable 
and targets fusion proteins for proteolytic degradation¹⁷,¹⁸, thus making
PfAP2-G protein levels regulatable by the addition of Shld1 (Supplementary
Fig. 6c). Indeed, in the pfap2-g-ddfkbp line gametocyte
formation was completely dependent on the addition of Shld1, whereas 
its presence had no effect on gametocyte production by the E5 parent 
(Fig. 2d, e), demonstrating that PfAP2-G function is essential for game- 
tocyte formation. 

On the basis of the localization of haemagglutinin (HA)-tagged
PfAP2-G to the parasite nucleus (Fig. 3a and Supplementary Fig. 7)
and the fact that several ApiAP2 proteins act as transcriptional regulators,
we aimed to identify possible regulatory targets of PfAP2-G. To
do this, we compared the global transcriptional pattern over the 48-h
intraerythrocytic cycle for the gametocyte-producing parent E5 to those
of the mutant non-producers Δpfap2-g and F12. As expected, only a
small number of transcripts changed by greater than twofold in both
mutants, with four transcripts increasing and 23 transcripts decreasing
in abundance (Fig. 3b and Supplementary Table 2). All four upregulated
genes are located in subtelomeric regions and have previously been shown
to undergo spontaneous transcriptional variation and were therefore
not considered further⁷. However, the cluster of downregulated genes
is highly enriched for genes expressed during the first stages of gametocyte
formation (P < 0.003), including some of the earliest known markers
of sexual commitment: pfs16, pfg27/25 and pfg14.744 (refs 19, 20).
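
The selection of transcripts that change by more than twofold in both mutants can be sketched as follows. This is an illustrative outline only, assuming that each gene is represented by log2 abundance ratios at the eight time points and that a twofold change corresponds to a mean log2 difference beyond ±1; the function names, gene labels and values are hypothetical and do not reproduce the published analysis pipeline.

```python
from statistics import mean

TWOFOLD = 1.0  # a twofold change equals 1 unit on a log2 scale


def mean_difference(mutant, parent):
    """Average log2 difference (mutant minus parent) across time points."""
    return mean(m - p for m, p in zip(mutant, parent))


def changed_in_both(parent, mutant_a, mutant_b, threshold=TWOFOLD):
    """Return genes increased, and genes decreased, beyond threshold in BOTH mutants."""
    increased, decreased = [], []
    for gene, parent_series in parent.items():
        diff_a = mean_difference(mutant_a[gene], parent_series)
        diff_b = mean_difference(mutant_b[gene], parent_series)
        if diff_a > threshold and diff_b > threshold:
            increased.append(gene)
        elif diff_a < -threshold and diff_b < -threshold:
            decreased.append(gene)
    return increased, decreased


# Hypothetical log2 ratios at eight time points (placeholder genes and values).
parent = {"gene_1": [0.0] * 8, "gene_2": [2.0] * 8, "gene_3": [1.0] * 8}
knockout = {"gene_1": [1.5] * 8, "gene_2": [0.5] * 8, "gene_3": [1.2] * 8}
nonproducer = {"gene_1": [1.2] * 8, "gene_2": [0.2] * 8, "gene_3": [-0.5] * 8}

up, down = changed_in_both(parent, knockout, nonproducer)
print("increased:", up)
print("decreased:", down)
```

Requiring agreement between two independently derived mutants is the design choice that matters here: a gene that changes in only one line (gene_3 in the toy data) is more likely to reflect clonal variation than loss of PfAP2-G.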
Figure 2 | Disrupting PfAP2-G function results in loss of gametocyte
production. a, Positions of pfap2-g mutations in the gametocyte non-producer
lines F12 and GNP-A4 and the targeted deletion of Δpfap2-g. b, Southern
blot showing successful disruption of the pfap2-g locus by homologous
recombination (also see Supplementary Fig. 4). Single replicate. c, pfap2-g
mutants fail to produce gametocytes (n = 3, standard error shown).
d, Ligand-regulatable gametocyte formation in PfAP2-G-ddFKBP (bottom
row images) but not in the E5 parent (top row images). Representative of n = 4.
Scale bars, 5 µm. e, Quantification of ligand-regulatable gametocyte formation
(n = 4, standard error shown).
Figure 3 | Identification of PfAP2-G targets. a, PfAP2-G-HA×3 localizes to
the nuclei of schizonts in asexually growing parasites (see Supplementary Fig. 7
for additional stages). Scale bar, 1 µm. Representative of n = 8. BF, bright field;
DAPI, 4’,6-diamidino-2-phenylindole. b, Relative abundance of transcripts
with greater than twofold average difference in both Δpfap2-g and F12 with
respect to 3D7-B clone E5 across the intra-erythrocytic developmental cycle at
6-h intervals. Columns on the right indicate whether genes are known
gametocyte markers (blue), detected in two or more gametocyte proteomes


qRT-PCR measurements of early gametocyte markers confirmed the
lower relative abundance levels in Δpfap2-g (Supplementary Fig. 8). Analysis
of the upstream regions of most downregulated genes showed that
they were also enriched in the DNA motif recognized by PfAP2-G²¹
(P < 0.017). These results implicate PfAP2-G as a transcriptional switch
that controls sexual differentiation by activating the transcription of
early gametocyte genes.
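
Counting occurrences of a cognate motif in the region upstream of each start codon can be sketched as below. This is a minimal illustration under stated assumptions: the upstream sequence is supplied 5' to 3' and ends immediately before the start codon, both strands are scanned, and the motif string used in the example is an arbitrary placeholder rather than the published PfAP2-G recognition sequence. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_motif(upstream, motif, window=2000):
    """Count motif hits on both strands within `window` bases of the start codon.

    `upstream` is assumed to be written 5'->3' and to end just before the
    start codon, so the last `window` bases are the ones examined.
    """
    region = upstream[-window:].upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = count_overlapping(region, motif)
    if rc != motif:  # avoid double counting palindromic motifs
        hits += count_overlapping(region, rc)
    return hits


# Placeholder motif and sequence, for illustration only.
example_upstream = "ttaagtacaattgtaccat"
print(count_motif(example_upstream, "GTAC"))
```

Per-gene counts obtained this way could then be compared between the downregulated set and the rest of the genome with an enrichment test; the specific test behind the reported P value is not stated in this section, so none is assumed here.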

Using electrophoretic mobility shift assays we confirmed that the 
recombinant PfAP2-G DNA-binding domain could interact with three 
gametocyte promoters in a motif-dependent manner in vitro (Fig. 3c). 
To test whether this interaction occurs within the parasite, we transfected
E5 and Δpfap2-g with luciferase reporter constructs under the
control of these gametocyte promoters (Fig. 3d). There was a significant
reduction in luciferase activity in the Δpfap2-g background compared
to its E5 parent for all three constructs. In addition, luciferase
levels were also significantly diminished in the parental E5 line when 
we altered the PfAP2-G recognition sequence in the two promoters 
tested, indicating that PfAP2-G probably acts as a direct transcrip- 
tional activator of the earliest gametocyte genes. 

The pfap2-g locus shares many features that have been associated
with the epigenetic silencing of multigene families in P. falciparum⁴,²²
such as high levels of the H3K9me3 histone modification³, associated
binding of heterochromatin protein 1 (PfHP1)²,²³ and perinuclear
localization³. On the basis of these data PfAP2-G expression is probably
regulated epigenetically by reversible formation of repressive chromatin
structures. Interestingly, we find that the pattern of histone modifications
at this locus is typical of heterochromatin-silenced genes in both
the high gametocyte producer E5 and its low-producing A7 sibling clone
(Supplementary Fig. 9a-c). This finding suggests that, in predominantly
asexual blood-stage cultures, the pfap2-g locus is found in a heterochromatic
(silenced) state in the majority of parasites and that the transcriptionally
permissive state may only occur in a small number of sexually
committed parasites. Indeed, the vast majority of asexually growing
(orange), enriched in early gametocyte proteome (purple), and the number of
PfAP2-G cognate motifs within 2 kb upstream of the start codon. c, Binding of
the recombinant PfAP2-G AP2 domain to three gametocyte promoters occurs
only in the presence of the wild-type cognate motif (+). Representative of
n = 3. d, Relative luciferase activity under the control of wild-type gametocyte
promoters in 3D7-B E5 (blue) and Δpfap2-g (red), or in 3D7-B E5 under
control of promoters lacking the PfAP2-G motif (green) (18-30 h post-invasion,
n = 3, standard error is shown, two-sided t-test used). NA, not tested.


PfAP2-G-HA×3 (in which HA×3 denotes a triple HA tag) parasites
contained no detectable levels of PfAP2-G by immunofluorescence,
whereas a small subpopulation exhibited clear nuclear PfAP2-G staining
(Fig. 4a). Every newly formed merozoite within PfAP2-G-expressing
schizonts stained positive for PfAP2-G, lending further support to the
previous findings that all daughter parasites from a given schizont are
committed to the same developmental fate²⁴. Furthermore, although the
PfAP2-G-positive fraction varied between experiments, it was highly
predictive of subsequent gametocyte formation in commitment assays
(R² = 0.94, Fig. 4b).

Stochastic, low-frequency activation would provide a simple mech- 
anism for baseline gametocyte production, which may be modulated 
in response to environmental stimuli. Furthermore, the presence of 
insulator-like pairing element sequences, which have been suggested
to have an important role in the silencing of var genes²⁵, flanking the
pfap2-g locus (Supplementary Fig. 9d) raises the intriguing possibility
that the expression of pfap2-g may be mutually exclusive with that of 
the var gene family”. In addition to chromatin-mediated control, PfAP2- 
G expression may be autoregulated via binding to the eight instances of 
the PfAP2-G cognate motifs located 2.1-3.6 kb upstream of the PfAP2-G 
locus (Supplementary Fig. 10). We have integrated these various reg- 
ulatory mechanisms into a model of how PfAP2-G expression controls 
the decision of individual cells to commit to gametocyte formation or 
to continue along the default pathway of asexual replication (Fig. 4c). 
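
The idea that stochastic, low-frequency activation yields a low baseline of commitment can be illustrated with a toy simulation. This is a sketch under explicit assumptions, not the model of Fig. 4c: each schizont activates pfap2-g independently with a fixed probability, all daughter merozoites of an activated schizont commit to the sexual fate, and the number of merozoites per schizont (16) and the activation probability are arbitrary illustrative values.

```python
import random


def simulate_commitment(n_schizonts, p_activation,
                        merozoites_per_schizont=16, seed=0):
    """Fraction of daughter parasites committed to gametocyte formation.

    Assumes (for illustration) independent activation per schizont and that
    every merozoite from an activated schizont shares the committed fate.
    """
    rng = random.Random(seed)
    committed = 0
    for _ in range(n_schizonts):
        if rng.random() < p_activation:
            committed += merozoites_per_schizont
    return committed / (n_schizonts * merozoites_per_schizont)


# Hypothetical activation probability within the 1-6% range of
# PfAP2-G-positive schizonts reported in Fig. 4a.
fraction = simulate_commitment(n_schizonts=10_000, p_activation=0.05)
print(f"committed fraction: {fraction:.3f}")
```

Under these assumptions the committed fraction simply tracks the activation probability, which is consistent with the observation that the PfAP2-G-positive fraction predicts subsequent gametocyte formation; modulation by environmental stimuli would correspond to changing `p_activation`.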

Together with the work of Sinha et al.²⁶ (accompanying manuscript),
our results demonstrate that AP2-G is an essential regulator of game- 
tocyte formation in malaria parasites and acts as a developmental switch 
by activating the transcription of early gametocyte genes. This provides 
the first insight into the molecular mechanisms controlling the asexual/ 
sexual developmental decision in malaria parasites and unveils new 
targets in the long-standing aim of interrupting malaria transmission 
by preventing the formation and/or maturation of the parasite’s sexual 
stages¹. Last, ligand-regulatable PfAP2-G is not only a powerful new
Figure 4 | Activation of PfAP2-G. a, Only a small fraction (1-6%) of asexually
growing subclone 9A schizonts (see Supplementary Fig. 7 for details) express
detectable levels of PfAP2-G-HA×3 (top row). H3K4me3 staining was
performed in parallel to confirm full permeabilization (bottom row). Scale


tool for future studies of sexual-stage development in malaria parasites 
but also holds great potential for inducible gene expression in general. 


METHODS SUMMARY 

Parasites and strains. Δpfap2-g knockout parasites were generated by transfection
of 3D7-B E5 with pHHT-FCU-pfap2-g (Supplementary Fig. 4) followed by
positive (hdhfr)/negative (fcu) selection. pfap2-g-ddfkbp parasites were generated
by transfection of 3D7-B E5 with pJDD145-pfap2-g (Supplementary Fig. 6). Parasites
expressing PfAP2-G-HA×3 were generated by transfection of 3D7-B E5 with
pHH1inv-pfap2-g-HAx3 (Supplementary Fig. 7). All parasites were grown in media
containing AlbuMAX II and synchronized by standard methods.
Gametocytogenesis. Gametocyte induction was performed according to published
methods²⁷. For ligand-regulatable gametocytogenesis (Fig. 2d, e), synchronized
parasites were set up at 0.5-1.0% late trophozoites in 3% hematocrit on day 0.
Cultures were split in two and treated with 0.5 µM Shld1 or solvent control for the
remainder of the experiment.

Gel shifts. Electrophoretic mobility shift assays were performed using Light Shift
EMSA kits (Thermo Scientific) with 2 µg of protein and 20 fmol of probe.
Microarrays. Starting at 3 h post-invasion, tightly synchronized parasites were collected
at eight time points with 6-h intervals. RNA isolation, complementary DNA
generation/labelling, array hybridization, and feature extraction were performed as
described previously²⁸. Cy5-labelled cDNA was hybridized with a common Cy3-labelled
reference pool on the P. falciparum 8×15K Agilent nuclear expression
array (Gene Expression Omnibus (GEO) platform accession GPL17880). Genes
were rank ordered by their average relative transcript abundance differences across
the eight time points between the wild type (E5) and mutant (F12 or Δpfap2-g).
Luciferase assays. Equal numbers of synchronized, stably transfected parasites
were isolated and saponin-lysed (0.05% in PBS) at ~18-30 h post-invasion and
assayed using the Bright-Glo Luciferase Assay System (Promega).

Next-generation sequencing and analysis. Next-generation sequencing of the
3D7-B subclone E5 and Δpfap2-g was performed using Illumina TruSeq single-end
sequencing runs, analysed and visualized as described previously²⁹. Genomic
DNA for 3D7-A, F12 and GNP-A4 was also used for whole genome sequencing at
bars, 10 µm. Representative of n = 4. b, The percentage of PfAP2-G-HA×3-positive
cells is highly predictive (R² = 0.94) of subsequent gametocyte
formation levels. c, Model of PfAP2-G activation and function.


the Sanger Institute using Illumina GA II technology with 76-base paired-end reads. 
The raw sequence data were processed as described previously³⁰. Experimental
confirmation of informative genomic variants was performed using capillary
sequencing methods.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 


Received 23 April; accepted 27 November 2013. 
Published online 23 February 2014. 


1. Wells, T. N. C., Alonso, P. L. & Gutteridge, W. E. New medicines to improve control
and contribute to the eradication of malaria. Nature Rev. Drug Discov. 8, 879-891
(2009).

2. Flueck, C. et al. Plasmodium falciparum heterochromatin protein 1 marks genomic
loci linked to phenotypic variation of exported virulence factors. PLoS Pathog. 5,
e1000569 (2009).

3. Lopez-Rubio, J. J., Mancio-Silva, L. & Scherf, A. Genome-wide analysis of
heterochromatin associates clonally variant gene regulation with perinuclear
repressive centers in malaria parasites. Cell Host Microbe 5, 179-190
(2009).

4. Cortés, A., Crowley, V. M., Vaquero, A. & Voss, T. S. A view on the role of epigenetics in
the biology of malaria parasites. PLoS Pathog. 8, e1002943 (2012).

5. Alano, P. Plasmodium falciparum gametocytes: still many secrets of a hidden life.
Mol. Microbiol. 66, 291-302 (2007).

6. Dixon, M. W. A., Thompson, J., Gardiner, D. L. & Trenholme, K. R. Sex in Plasmodium:
a sign of commitment. Trends Parasitol. 24, 168-175 (2008).

7. Rovira-Graells, N. et al. Transcriptional variation in the malaria parasite
Plasmodium falciparum. Genome Res. 22, 925-938 (2012).

8. Painter, H. J., Campbell, T. L. & Llinas, M. The Apicomplexan AP2 family: integral
factors regulating Plasmodium development. Mol. Biochem. Parasitol. 176, 1-7
(2011).

9. Yuda, M. et al. Identification of a transcription factor in the mosquito-invasive stage
of malaria parasites. Mol. Microbiol. 71, 1402-1414 (2009).

10. Yuda, M., Iwanaga, S., Shigenobu, S., Kato, T. & Kaneko, I. Transcription factor
AP2-Sp and its target genes in malarial sporozoites. Mol. Microbiol. 75, 854-863
(2010).


13 MARCH 2014 | VOL 507 | NATURE | 251 


©2014 Macmillan Publishers Limited. All rights reserved 


LETTER 


11. Iwanaga, S., Kaneko, I., Kato, T. & Yuda, M. Identification of an AP2-family protein
that is critical for malaria liver stage development. PLoS ONE 7, e47557 (2012).

12. Alano, P. et al. Plasmodium falciparum: parasites defective in early stages of
gametocytogenesis. Exp. Parasitol. 81, 227-235 (1995).

13. Day, K. P. et al. Genes necessary for expression of a virulence determinant and for
transmission of Plasmodium falciparum are located on a 0.3-megabase region of
chromosome 9. Proc. Natl Acad. Sci. USA 90, 8292-8296 (1993).

14. Eksi, S. et al. Plasmodium falciparum gametocyte development 1 (Pfgdv1) and
gametocytogenesis early gene identification and commitment to sexual
development. PLoS Pathog. 8, e1002964 (2012).

15. Ikadai, H. et al. Transposon mutagenesis identifies genes essential for Plasmodium
falciparum gametocytogenesis. Proc. Natl Acad. Sci. USA 110, E1676-E1684
(2013).

16. Manske, M. et al. Analysis of Plasmodium falciparum diversity in natural infections
by deep sequencing. Nature 487, 375-379 (2012).

17. Armstrong, C. M. & Goldberg, D. E. An FKBP destabilization domain modulates
protein levels in Plasmodium falciparum. Nature Methods 4, 1007-1009 (2007).

18. Banaszynski, L. A., Chen, L.-C., Maynard-Smith, L. A., Ooi, A. G. L. & Wandless, T. J. A
rapid, reversible, and tunable method to regulate protein function in living cells
using synthetic small molecules. Cell 126, 995-1004 (2006).

19. Pradel, G. Proteins of the malaria parasite sexual stages: expression, function and
potential for transmission blocking strategies. Parasitology 134, 1911-1929
(2007).

20. Silvestrini, F. et al. Protein export marks the early phase of gametocytogenesis of
the human malaria parasite Plasmodium falciparum. Mol. Cell. Proteomics 9,
1437-1448 (2010).

21. Campbell, T. L., De Silva, E. K., Olszewski, K. L., Elemento, O. & Llinas, M.
Identification and genome-wide prediction of DNA binding specificities for the
ApiAP2 family of regulators from the malaria parasite. PLoS Pathog. 6, e1001165
(2010).

22. Guizetti, J. & Scherf, A. Silence, activate, poise, and switch! Mechanisms of
antigenic variation in Plasmodium falciparum. Cell. Microbiol. 15, 718-726
(2013).

23. Pérez-Toledo, K. et al. Plasmodium falciparum heterochromatin protein 1 binds to
tri-methylated histone 3 lysine 9 and is linked to mutually exclusive expression of
var genes. Nucleic Acids Res. 37, 2596-2606 (2009).

24. Bruce, M. C., Alano, P., Duthie, S. & Carter, R. Commitment of the malaria parasite
Plasmodium falciparum to sexual and asexual development. Parasitology 100,
191-200 (1990).

25. Avraham, I., Schreier, J. & Dzikowski, R. Insulator-like pairing elements regulate
silencing and mutually exclusive expression in the malaria parasite Plasmodium
falciparum. Proc. Natl Acad. Sci. USA 109, 52 (2012).

26. Sinha, A. et al. A cascade of DNA-binding proteins for sexual commitment and
development in Plasmodium. Nature http://dx.doi.org/10.1038/nature12970
(this issue).

27. Fivelman, Q. L. et al. Improved synchronous production of Plasmodium falciparum
gametocytes in vitro. Mol. Biochem. Parasitol. 154, 119-123 (2007).

28. Kafsack, B. F. C., Painter, H. J. & Llinas, M. New Agilent platform DNA microarrays
for transcriptome analysis of Plasmodium falciparum and Plasmodium berghei for
the malaria research community. Malar. J. 11, 187 (2012).

29. Straimer, J. et al. Site-specific genome editing in Plasmodium falciparum using
engineered zinc-finger nucleases. Nature Methods 9, 993-998 (2012).


252 | NATURE | VOL 507 | 13 MARCH 2014 
©2014 Macmillan Publishers Limited. All rights reserved 


30. Robinson, T. et al. Drug-resistant genotypes and multi-clonality in Plasmodium
falciparum analysed by direct genome sequencing from peripheral blood of 
malaria patients. PLoS ONE 6, e23204 (2011). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We would like to thank C. Klein, T. Campbell and A. Schieler for
technical assistance and are grateful to O. Billker, C. Flueck, J. Kelly, C. Sutherland,
A. Vaidya and A. Waters for discussion and reading of the manuscript. We would also
like to thank P. Alano for providing P. falciparum clone F12, C. Taylor for providing the
P. falciparum GNP-A4 clone, E. Thompson for isolating P. falciparum DNA for whole
genome analysis, Z. Gorvett for assistance with confirming single nucleotide
polymorphisms in gametocyte non-producing clones, M. Duraisingh for the ddFKBP
tagging construct pJDD145, C. Ben Mamoun for the anti-PP2c antibody and
D. Goldberg for Shld1. M.L. is funded by National Institutes of Health (NIH) grant R01
AI076276 with support from the Centre for Quantitative Biology (P50GM071508).
B.F.C.K. was supported by a Howard Hughes Medical Institute fellowship of the Damon
Runyon Cancer Research Foundation. D.A.B. is funded by Wellcome Trust grant ref.
094752 and European Commission FP7 ‘MALSIG’ (ref. 223044). L.G.D. is supported by
a Biotechnology and Biological Sciences Research Council CASE PhD studentship with
Pfizer as the Industrial partner. A.C. is funded by the Spanish Ministry of Science and
Innovation grant SAF2010-20111. V.M.C. was supported by a fellowship from IRB
Barcelona. T.G.C. is supported by the Medical Research Council UK (J005398). D.P.K.
and S.G.C. are supported through the Wellcome Trust (09805;090532/Z/09/Z) and
the Medical Research Council UK (G0600230). C.B. is supported by the Catalan
Government fellowship 2011-BP-B 00060 (AGAUR, Catalonia, Spain).

Author Contributions M.L. managed the overall project with input from B.F.C.K., D.A.B.
and A.C. B.F.C.K. generated the Δpfap2-g knockout, PfAP2-G-ddFKBP and luciferase
lines and designed, performed and analysed the microarray, gel shift, luciferase and
ligand-regulatable gametocytogenesis experiments. V.M.C. performed qRT-PCR
validation. A.E.W. prepared Δpfap2-g sequencing libraries and together with B.F.C.K.
analysed the sequencing data. D.A.B., T.G.C. and S.G.C. conceived the sequencing of
gametocyte non-producer lines F12 and GNP-A4. T.G.C. analysed the gametocyte
non-producer sequencing data and L.G.D. confirmed the SNPs by PCR. S.G.C. and
D.P.K. carried out and supervised sequencing of gametocyte non-producer lines,
respectively. A.C. and N.R.-G. generated E5 and other 3D7-B subclones and
respectively supervised and performed the experiments presented in Figs. 1 and 2b,
and provided the analysis presented in Supplementary Fig. 1. V.M.C. and N.R.-G.
performed and A.C. supervised chromatin immunoprecipitation experiments. C.B. and
A.C. generated the PfAP2-G-HA×3 line and carried out immunofluorescence assays
and correlations with gametocyte formation. B.F.C.K. wrote the manuscript with major
input from M.L., D.A.B. and A.C.

Author Information Microarray data was submitted to the NCBI GEO repository (series
accession GSE52030). Next generation sequencing data was submitted to the NCBI
Sequence Read Archive (SRA) (study number ERP000190 for samples F12
(ERS011445), 3D7A (ERS011446) and GNP-A4 (ERS011447) and study number
SRP035432 for samples E5 (SRS529791) and Δpfap2-g (SRS529811)). Reprints and
permissions information is available at www.nature.com/reprints. The authors declare
no competing financial interests. Readers are welcome to comment on the online
version of the paper. Correspondence and requests for materials should be addressed
to M.L. (manuel@psu.edu).


METHODS 

Parasites and strains. Parasite lines 3D7-A*', 3D7-B*! and F12' have been des- 
cribed previously. Note that 3D7-A is not the same line as the competent game- 
tocyte producer 3D7A*, which was used only as a reference genome for next- 
generation sequence analysis. 3D7-B subclones E5, A7 and B11 were generated by 
limiting dilution. The gametocyte non-producer line GNP-A4 was generated 
during an attempt to knock out a phosphodiesterase gene (PfPDEδ, PF14_0672).
Integration of the knockout construct by single crossover homologous recombination
occurred at the targeted locus but this event was not responsible for the clone’s
inability to produce gametocytes’. A subsequent successful knockout of PfPDEδ
produced gametocytes at normal rates and the true phenotype was a significantly 
lower exflagellation rate than parental parasites owing to a reduced ability of 
gametes to egress from red blood cells**. Δpfap2-g knockout parasites were generated
by transfection of 3D7-B E5 with pHHT-FCU-pfap2-g (Supplementary
Fig. 4) followed by positive (hdhfr)/negative (fcu) selection using WR99210 and 
5-fluoro-cytosine as described previously”. Resistant parasites were subcloned 
and verified by PCR and Southern blot. pfap2-g-ddfkbp parasites were generated 
by transfection of 3D7-B E5 with pJDD145-pfap2-g and selected on WR99210. 
After subcloning, integration was verified by PCR using a forward primer at 
position +4,269 and a ddFKBP reverse primer. Displacement of the endogenous 
downstream sequence was verified using primers at +4,269 and +7,490 with 
respect to the translation initiation site (Supplementary Fig. 6). Parasites expressing
a HA×3-tagged version of PfAP2-G were obtained by transfecting 3D7-B E5
with the plasmid pHH1inv-pfap2-g-HA×3 and cycling twice on/off WR99210 to
select for parasites where the plasmid has integrated into the genome. After subcloning
by limiting dilution and Southern blot analysis (Supplementary Fig. 7b), a
subclone with a single copy of the plasmid integrated at the pfap2-g locus
(E5-pfap2-g-HA×3 clone 9A) was selected for immunofluorescence assay (IFA)
analysis. All parasites were grown in media containing AlbuMAX II and synchronized
by standard methods”.

Knockout and ddFKBP-tagging constructs. Knockout construct: the regions
from −126 base pairs (bp) to +366 bp and +6,945 to +7,379 bp with respect to
the pfap2-g initiation codon were cloned into the NcoI/EcoRI and SpeI/SacII sites
of pHHT-FCU™, respectively, to generate pHHT-FCU-pfap2-g. ddFKBP carboxy-terminal
tagging construct: pfap2-g coding sequence positions +4,740 to +7,296
were cloned into the NotI/XhoI sites of pJDD145” (gift from M. Duraisingh).

Luciferase expression constructs. The hdhfr selectable marker of pVLbIDh** was
replaced with blasticidin-S deaminase using the SacI/NotI sites to generate pVL-BSD.
The var7b promoter was excised with HpaI/KpnI, blunted and re-ligated,
destroying these sites. The 1,445 bp, 1,226 bp and 1,159 bp upstream of the pf11-1,
pfg27/25 and pfpeg4 start codons were cloned into the AatII/NcoI sites. The PfAP2-G
cognate motifs in the upstream sequences of pf11-1 (−328 to −323) and pfpeg4
(−1,138 to −1,131) were converted to adenines using site-directed mutagenesis.

HA×3-tagging construct. The plasmid pHH1inv-pfap2-g-HA×3 was derived
from the plasmid E140-0°. A triple HA tag (HA×3) was cloned into KpnI-XhoI
sites of E140-0, replacing the eba-140 open reading frame (ORF) and introducing
several new restriction sites (plasmid pHH1inv_HA×3). A fragment of the pfap2-g
ORF from position +6,685 to the stop codon was PCR-amplified and cloned in-frame
into KpnI-PstI sites of pHH1inv_HA×3, such that upon integration of the
plasmid by single homologous recombination PfAP2-G is expressed as a fusion
protein with the HA×3 tag, separated by the sequence YLQ.

Gametocytogenesis. Gametocyte conversion rates for the 3D7-A, 3D7-B and 3D7-B
subclones (Fig. 1) were measured by treating synchronized ring-stage parasite
cultures at 5% parasitemia with N-acetyl-D-glucosamine and counting gametocytaemia
3-4 days later. Gametocyte induction of E5, F12, GNP-A4 and Δpfap2-g
(Fig. 2c) was performed according to published methods” in media containing 5%
AB+ heat-inactivated human serum and 0.25% AlbuMAX II. For ligand-regulatable
gametocytogenesis (Fig. 2d, e), synchronized parasites were set up at 0.5-1.0%
late trophozoites in 3% hematocrit on day 0. Cultures were split in two and
treated with 0.5 µM Shld1 (gift from D. Goldberg) or an equal volume of ethanol
solvent control for the remainder of the experiment. Parasitemia was determined
on day 3 and 50 mM N-acetyl-D-glucosamine was added to all cultures for the
remainder of the experiment. Gametocytemia was determined on day 9 and converted
into per cent commitment by dividing by day 3 parasitemia. Statistical significance
in gametocyte production was determined using unpaired two-sided t-tests.
Replicates were biological not technical.
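
The per cent commitment calculation described above (day 9 gametocytemia divided by day 3 parasitemia) can be sketched as follows. This is a minimal illustration rather than the authors' analysis code; the function name and the culture values are hypothetical.

```python
def percent_commitment(day9_gametocytemia: float, day3_parasitemia: float) -> float:
    """Day 9 gametocytemia divided by day 3 parasitemia, expressed as a percentage."""
    if day3_parasitemia <= 0:
        raise ValueError("day 3 parasitemia must be positive")
    return 100.0 * day9_gametocytemia / day3_parasitemia


# Hypothetical paired cultures: (day 3 parasitemia %, day 9 gametocytemia %)
cultures = {"+Shld1": (4.0, 0.5), "ethanol control": (4.0, 0.1)}
commitment = {
    name: percent_commitment(gametocytemia, parasitemia)
    for name, (parasitemia, gametocytemia) in cultures.items()
}
```

The resulting per-culture values would then feed the unpaired two-sided t-test mentioned in the text.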

Growth competition. E5 and Δpfap2-g parasites were mixed at 1:1 ratio and
grown for 7 or more weeks in triplicate. Parasites were diluted 1:20 with uninfected 
erythrocytes and genomic DNA was isolated whenever parasitemia exceeded 10%. 
For each time point, gDNA was isolated from the technical replicates, pooled at 
equal concentration, used for PCR amplification of a 414-nucleotide (nt) (1,718- 
2,131) region of PFF0275c covering the single nucleotide polymorphism (SNP) 
(described in Supplementary Table 1) and sequenced using the reverse primer. 




Difference in the growth rate was determined using the relative sequencing read
peak height of A and C across at least 16 replication cycles (32 or more days).
Growth rate was fit to the data using:

$$\mathrm{WT}_{t=x} = \frac{\mathrm{WT}_{t=0}}{\mathrm{WT}_{t=0} + \mathrm{KO}_{t=0} \times (1+\Delta g)^{x/2}}$$

$$\mathrm{KO}_{t=x} = 1 - \mathrm{WT}_{t=x}$$

where $\mathrm{WT}_{t=x}$ is the relative peak height of cytosine at day x, $\mathrm{KO}_{t=x}$ is the relative
peak height of adenine at day x, and $\Delta g$ is the per cent difference in growth rate
between Δpfap2-g and E5. 95% confidence intervals were determined using 1.96 ×
standard error of the mean for the difference in growth rate. Replicates were
biological not technical.
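
As an illustration of how such a fit might be carried out, the sketch below estimates the growth-rate difference from a time course of wild-type peak-height fractions. It assumes the model WT(x) = WT(0) / (WT(0) + KO(0) × (1 + Δg)^(x/2)), with a 2-day replication cycle inferred from "16 replication cycles (32 or more days)", and expresses Δg as a fraction rather than a percentage. The function names, the grid search and the synthetic data are hypothetical and are not the authors' procedure.

```python
def wt_fraction(day, wt0, delta_g, cycle_days=2.0):
    """Expected wild-type fraction on a given day under the competition model."""
    ko0 = 1.0 - wt0  # the knockout fraction is the complement of the wild type
    return wt0 / (wt0 + ko0 * (1.0 + delta_g) ** (day / cycle_days))


def fit_delta_g(days, wt_observed, wt0, step=0.001, span=0.5):
    """Grid-search delta_g in [-span, span] for the least squared error."""
    n = round(span / step)
    candidates = [i * step for i in range(-n, n + 1)]

    def sse(delta_g):
        return sum(
            (wt_fraction(d, wt0, delta_g) - y) ** 2
            for d, y in zip(days, wt_observed)
        )

    return min(candidates, key=sse)


# Hypothetical time course: knockout grows 7% faster per cycle from a 1:1 mix
days = list(range(0, 36, 4))
observed = [wt_fraction(d, 0.5, 0.07) for d in days]
estimate = fit_delta_g(days, observed, wt0=0.5)
```

A grid search is used only to keep the sketch dependency-free; a least-squares optimizer would serve equally well on real data.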

Chromatin immunoprecipitation (ChIP). ChIP experiments were performed as 
described previously*®. In brief, cultures were synchronized to late trophozoites/ 
schizont stage, saponin-lysed and crosslinked using formaldehyde. Nuclei were 
released using a Dounce homogenizer (Kimble Chase) and DNA was subsequently 
fragmented using a Bioruptor (Diagenode). Immunoprecipitations were carried 
out using commercial antibodies against H3K9ac (Millipore 07-352) and H3K9me3 
(Millipore 07-442) and analysed by qPCR using the relative standard curve method. 
The primers used for ChIP analysis of the pfap2-g locus amplify positions (relative
to the start codon) −4,954 to −4,875 (5′-1), −1,412 to −1,302 (5′-2), −449 to
−351 (5′-3), +3,874 to +3,979 (ORF-1), +5,318 to +5,433 (ORF-2) and +8,492
to +8,632 (3′-1). Primers for the control genes clag3.1 (primer pair 5, beginning of
the ORF), clag3.2 (primer pair 5, beginning of the ORF), ama-1 (primer pair 2,
beginning of the ORF) and the var gene PFL1950w (upstream region, presumably
5‘ untranslated region) have been described before**“’. Replicates were biological 
not technical. 

Western blots. At the late trophozoite stage synchronized parasites were treated 
with 0.5 µM Shld1 or ethanol solvent control for 36 h. Proteins were sequentially
extracted as described previously’, separated on 4-12% polyacrylamide gels (Life 
Technologies) and assayed using anti-HA tag (Roche Diagnostics 11 867 423 001), 
anti-histone 3 (Abcam ab1791) and anti-PfPP2C (gift from C. Ben Mamoun). Rep- 
licates were biological not technical. 

qRT-PCR. pfap2-g transcript abundance measurements were carried out as previ- 
ously described’ using a primer set designed to amplify positions + 3,874 to +3,979 
and normalized to seryl-tRNA synthetase abundance. Statistical significance was 
determined using a two-sided t-test. Replicates presented are technical and repre- 
sentative of several biological replicates. 

IFAs. IFAs were performed on smears of E5-pfap2-g-HA×3-9A cultures synchronized
to different stages. Air-dried smears were fixed for 10 min with 1% formaldehyde
and permeabilized for 10 min in 0.1% Triton X-100 in PBS. Experiments
performed on smears fixed with 90% acetone/10% methanol yielded identical 
results (not shown). Smears were incubated with rabbit anti-HA (1:100; Life 
technologies 71-5500) or rabbit anti-H3K4me3 (1:10,000; Millipore 05-745) anti- 
bodies. Secondary anti-rabbit antibodies were conjugated with Alexa Fluor 488 (Life 
technologies A-11034). Nuclei were stained with DAPI. Importantly, no wild-type 
E5 parasites were positive for staining with anti-HA antibody, and secondary anti- 
body controls also yielded no signal. Preparations were observed under a confocal 
Leica TCS-SP5 microscope with LAS-AF image acquisition software and were processed
using ImageJ software. The proportion of HA-positive schizonts was determined
by counting >3,000 schizonts (identified by DAPI staining) for each experiment. 
The gametocyte conversion rate was measured for each of the same parasite cul- 
tures used to quantify the proportion of schizonts positive for anti-HA by IFA. 
Replicates were biological not technical. 

Gel shifts. Electrophoretic mobility shift assays were performed using Light Shift
EMSA kits (Thermo Scientific) as previously described using 2 µg of protein and
20 fmol of probe’'. Biotinylated double-stranded probes were designed using the
24-nt flanking the PfAP2-G motif of the indicated upstream sequence. Probe
sequences were as follows with capital letters indicating the AP2-G motifs and
lowercase letters indicating the flanking sequences:
pfg27/25: 5′-ttattagtatctGTACACattggtatttgt-3′,
pf11-1: 5′-tatatatatattGTACACatacatgtagtt-3′,
pfpeg4: 5′-gacaataaagaaGTGTACACatatatcaataa-3′.
The motifs were replaced by an equal number of adenines for ‘no motif’ probes.
Replicates were conducted using the same materials on separate days.
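
The construction of the 'no motif' control probes can be sketched in a few lines. The probe sequences are transcribed from the text above with line breaks removed; the dictionary and function names are hypothetical, and the code only illustrates the substitution rule (each capitalized motif replaced by the same number of adenines).

```python
import re

# Probe sequences from the Methods: uppercase = AP2-G motif, lowercase = flanks
PROBES = {
    "pfg27/25": "ttattagtatctGTACACattggtatttgt",
    "pf11-1": "tatatatatattGTACACatacatgtagtt",
    "pfpeg4": "gacaataaagaaGTGTACACatatatcaataa",
}


def no_motif_probe(probe: str) -> str:
    """Replace each uppercase motif run with an equal number of adenines."""
    return re.sub(r"[ACGT]+", lambda m: "A" * len(m.group()), probe)


controls = {name: no_motif_probe(seq) for name, seq in PROBES.items()}
```

Keeping the adenines in uppercase preserves the convention that capital letters mark the (former) motif position, and the probe length is unchanged.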

Transcription profiling and associated analysis. Starting at 3h post-invasion, 
tightly synchronized parasites were collected at eight time points with 6-h intervals. 
RNA isolation, cDNA generation/labelling, array hybridization and feature extrac- 
tion was performed as described previously~*. Cy5-labelled cDNA was hybridized 
with a common Cy3-labelled reference pool on the P. falciparum 8×15K Agilent
nuclear expression array (Gene Expression Omnibus (GEO) platform ID GPL17880). 
Relative transcript abundance was determined using a shared Cy3-labelled ref- 
erence pool. All microarray data was submitted to the NCBI GEO repository 
(series accession number GSE52030). Genes were ordered by their average relative 
transcript abundance differences across the eight time points between the wild 
type (E5) and mutant (F12/Δpfap2-g). Occurrences of the trimmed (6-nt) PfAP2-G
motif were mapped using ScanACE to intergenic regions up to 2,000 bp upstream
of the start codon as previously described’ (see Supplementary Fig. 10 for motif). 
Significant enrichments of proteomic evidence and PfAP2-G motif occurrence 
were calculated using an unpaired two-sided t-test comparing the occurrences 
within the cluster of downregulated genes and their frequency genome-wide. Results 
were validated by qRT-PCR for a subset of downregulated genes using the primers 
in Supplementary Table 3 and methods described above. Statistically significant 
differences in relative expression levels were determined by two-sided t-test. 

Luciferase assays. Equal numbers of synchronized, stably transfected parasites were 
isolated and saponin-lysed (0.05% in PBS) at ~ 18-30 h post-invasion and assayed 
using Bright-Glo Luciferase Assay System (Promega) as per the manufacturer's 
protocol on a Synergy H1 (Bio-Tek) plate reader. Statistical significance was deter- 
mined using unpaired two-sided t-tests. Replicates were biological not technical. 

Next-generation sequencing and analysis. Genomic DNA was extracted (10 µg
each) from E5, Δpfap2-g, GNP-A4 and F12 parasite lines. This genomic DNA was
used to generate barcoded sequencing libraries for an Illumina TruSeq single-end 
sequencing run, analysed and visualized as described previously”. Genomic DNA 
for 3D7A”, F12”” and GNP-A4”* was also used for whole genome sequencing at 
the Sanger Institute using Illumina GA II technology with 76-base paired-end reads. 
The raw sequence data were processed as described previously*’. In brief, the raw 
data for each isolate was mapped onto the 3D7 reference genome (version 3) using 
the SMALT short read alignment algorithm’. High-quality SNPs and insertions 
and deletions (supported by bidirectional reads, and error rates less than one per 
1,000 bp) in unique genomic regions were called using SAMtools (http://samtools. 
sourceforge.net). Regions of interest were inspected using the Artemis alignment 
viewer (http://www.sanger.ac.uk/resources/software/artemis/), and polymorphisms 
compared to publicly available sequence data’®*”* processed as described above.


Experimental confirmation of informative genomic variants was performed using 
capillary sequencing methods. 
Commitment to and completion of sexual development are essential
for malaria parasites (protists of the genus Plasmodium) to be trans- 
mitted through mosquitoes’. The molecular mechanism(s) respons- 
ible for commitment have been hitherto unknown. Here we show that 
PbAP2-G, a conserved member of the apicomplexan AP2 (ApiAP2) 
family of DNA-binding proteins, is essential for the commitment of 
asexually replicating forms to sexual development in Plasmodium 
berghei, a malaria parasite of rodents. PbAP2-G was identified from 
mutations in its encoding gene, PBANKA_143750, which account 
for the loss of sexual development frequently observed in parasites 
transmitted artificially by blood passage. Systematic gene deletion 
of conserved ApiAP2 genes in Plasmodium confirmed the role of 
PbAP2-G and revealed a second ApiAP2 member (PBANKA_103430,
here termed PbAP2-G2) that significantly modulates but does not 
abolish gametocytogenesis, indicating that a cascade of ApiAP2 pro- 
teins are involved in commitment to the production and maturation 
of gametocytes. The data suggest a mechanism of commitment to 
gametocytogenesis in Plasmodium consistent with a positive feed- 
back loop involving PbAP2-G that could be exploited to prevent the 
transmission of this pernicious parasite. 

Malaria parasites spontaneously and stochastically produce sexual 
forms (gametocytes) required for mosquito transmission. Asexual para- 
sites commit to sexual development in the erythrocyte and the cell- 
cycle-arrested male and female gametocytes are available to initiate 
transmission when ingested within the blood meal of a female anophe- 
line mosquito. Gametocyte production may be lost when Plasmodium 
parasites are maintained either in continuous culture or by blood transfer 
between vertebrate hosts’. In a parasite line that produces fluorescently 
tagged gametocytes** we generated three gametocyte non-producer 
(GNP) lines (GNPm7, GNPm8 and GNPm9) that had verifiably lost
the ability to undertake gametocytogenesis after 52 weeks of mechanical 
passage (Fig. 1a, Supplementary Fig. 1 and Supplementary Table 1). 

Subsequent developmental stages (gametes, ookinetes) were absent 
and none of the GNP lines could be transmitted through mosquitoes (Sup- 
plementary Fig. 2 and Supplementary Table 2). Whole-genome sequenc- 
ing of these and an existing GNP line (ANKA 2.33) revealed numerous 
single nucleotide polymorphisms (SNPs) and insertions or deletions 
(indels) per line (Supplementary Fig. 3 and Supplementary Table 3); 
however, only a single gene, PBANKA_143750, carried a different and
therefore independent nonsense or missense mutation in each line (Fig. 1b).
PBANKA_143750 (here termed pbap2-g) encodes a putative transcription
factor predicted to be composed of 2,330 amino acids with a single
55-amino-acid AP2 class DNA-binding domain (DBD) at its carboxy
terminus (Fig. 1b). PbAP2-G belongs to the 27-strong*’ Plasmodium
ApiAP2 family of transcription factors, themselves part of the larger 
Apetala 2/ethylene response factor (AP2/ERF) family of transcription 
factors restricted to the Plantae and apicomplexan protists. The role of 


PbAP2-G in gametocyte production was confirmed either by correct- 
ing the mutations in pbap2-g in the GNP lines through genomic recom- 
bination with a wild-type copy (generating GNPm7REP, GNPm8REP, 
GNPm9REP and 2.33REP) or genetic complementation of a targeted 
deletion mutant of pbap2-g (Fig. 1c and Supplementary Fig. 4a—g). Func- 
tionality of the restored gametocytes was demonstrated in GNPm7REP 
and 2.33REP by transmission through mosquitoes (Fig. 1d and Supplementary
Table 4). Disruption of a second ApiAP2 gene, PBANKA_103430
(pbap2-g2) (Fig. 1b), resulted in the nearly complete (>95%)
loss of mature gametocytes, but in contrast to pbap2-g⁻ parasites, small
numbers of female gametocytes were occasionally observed (Fig. 1c).
These were not, however, transmitted successfully to mosquitoes. In
direct growth competition assays pbap2-g⁻ parasites outgrew wild-type
P. berghei and pbap2-g2⁻ parasites, which had wild-type growth rates
(Fig. 1e and Supplementary Fig. 5). pbap2-g mutants are therefore
uniquely capable of converting a loss of gametocytes into increased
asexual growth, which confers an advantage during asexual growth and
explains why continued blood passage invariably selects for mutations
in pbap2-g. This demonstrates that PbAP2-G functions specifically at
the point of commitment, whereas PbAP2-G2 is required downstream,
once sexual differentiation has become irreversible (Fig. 1e).

In a protein-binding microarray the recombinant DBD of PbAP2- 
G*” recognized closely related DNA motifs (Fig. 2a and Supplementary 
Table 5) identical to the previously derived motif for the DBD from the 
orthologous ApiAP2 protein of Plasmodium falciparum (PF3D7_1222600)°, 
confirming that both DBDs bind primarily to the same (GxGTACxC) 
motif (in which x denotes any residue). Electrophoretic mobility shift 
assay (EMSA) analyses (Fig. 2a) refined the motif to two 6-mers (GxGTAC 
and GTACxC, which are essentially palindromes of each other) that 
are sufficient and necessary for binding. A single point mutation in the 
core GTAC was sufficient to abrogate binding (Fig. 2a). These two motifs 
occurred within 2 kilobases (kb) upstream of 49% of all genes (2,359 of 
4,803 considered), yet more frequently in genes designated as upregu- 
lated in gametocytes (246 (54%) of 452 genes; P< 0.002, hypergeo- 
metric test). The occurrence of both motifs upstream of pbap2-g itself 
suggested the potential for an autoregulatory feedback mechanism, and 
the regions of the genome containing these motifs upstream of pbap2-g 
were both recognized by PbAP2-G in EMSA analysis (Fig. 2a). Expres- 
sion analysis demonstrated transcription of pbap2-g in blood-stage 
parasites; however, epitope tagging of full-length pbap2-g produced no 
detectable protein (Supplementary Fig. 6) yet gametocytogenesis was 
unaltered, implying that tagged PbAP2-G activity is unaffected. How- 
ever, a truncated cyan fluorescent protein (CFP)-tagged transgene product 
could be detected in nuclei of female gametocytes (Fig. 2c and Sup- 
plementary Fig. 7). 
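
The motif scan and enrichment test described in this paragraph can be illustrated with a short sketch. It assumes that x stands for any of the four bases, that scanning one strand for both 6-mers is sufficient because GxGTAC and GTACxC are reverse complements of each other, and that the 452 gametocyte-upregulated genes are a subset of the 4,803 genes considered. The function names are hypothetical, and since the exact gene sets and test configuration used in the paper are not given here, the computed tail probability need not reproduce the reported P value.

```python
import re
from math import comb

# GxGTAC or GTACxC, where x is any base (assumption: A, C, G or T)
MOTIF = re.compile(r"G[ACGT]GTAC|GTAC[ACGT]C")


def has_motif(upstream: str) -> bool:
    """True if either 6-mer occurs in the given upstream sequence."""
    return MOTIF.search(upstream.upper()) is not None


def hypergeom_upper_tail(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the motif."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


# Counts quoted in the text: 2,359 of 4,803 genes overall, 246 of 452 in the subset
p_value = hypergeom_upper_tail(4803, 2359, 452, 246)
```

Exact integer arithmetic via `math.comb` avoids the overflow and rounding problems that factorial-based floating-point formulas run into at these gene counts.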

Comparative microarray analyses showed that gametocyte-specific 
genes were highly enriched among the 500 most downregulated genes 
Figure 1 | Identification of mutations in pbap2-g that account for the
repeated spontaneous loss of commitment to gametocytogenesis. 

a, Gametocyte production during a year of continuous mechanical passage of 
P. berghei. Best-fit polynomial trend (thick) lines of gametocytaemia on 
individual weekly observations (thin lines). b, Open reading frames (ORF) 
(yellow) of pbap2-g (PBANKA_143750) and pbap2-g2 (PBANKA_103430)
with point mutations in new GNP lines shown in a and the long-established line
2.33. Predicted DBDs (light blue) and DBD recognition motifs for PbAP2-G
upstream of each ORF (brown bars) are indicated. Dark blue arrows show 
integration sites for selectable marker cassettes as used for genetic 
complementation of GNPs (COMP-DOWN) or to disrupt the promoter 
(COMP-UP). Numbering is relative to position 1 of the ORF. c, FACS analyses 
of male and female gametocyte numbers (circled areas) expressed as a 
percentage of the total parasitized cell counts. From left, P. berghei ANKA HP 
line (which lacks green (GFP) or red (RFP) fluorescent protein reporters, thus 
having no fluorescent signal and from which all subsequent lines reported in 
this study were derived) served as a negative control. Line 820 is the reporter 
line from which GNP mutants and a targeted knockout (KO) (using vector 
PbGEM-072446) were derived. 820REP and GNPm7REP were generated with 
the COMP-DOWN complementation vector. d, Giemsa-stained gametocytes 
in GNP line 2.33 (G756) repaired by the COMP-DOWN construct and after a 


in GNP lines (P< 10 *', Fisher’s exact test), pbap2-deletion parasites 
(P<10 *) and in the pbap2-g2 deletion mutant (P < 10 *”), although 
less marked in the latter (Table 1 and Supplementary Fig. 6). Com- 
parison of the transcriptomes of wild-type asexual blood-stage parasites 
with those of various pbap2-g lines was performed in an attempt to 
identify early-transcribed genes downstream of and under control of 
PbAP2-G (Fig. 3a). The steady-state transcription levels of 307 genes 
were identified as being downregulated (>2 s.d. reduced from the mean, 
Supplementary Table 6) in schizonts. 
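
A threshold of this kind (more than 2 s.d. below the mean) can be sketched as below. The gene names and log2 ratios are invented for illustration, and the use of the sample standard deviation over all genes is an assumption, since the text does not specify how the mean and s.d. were computed.

```python
from statistics import mean, stdev


def downregulated(log2_ratios: dict[str, float], n_sd: float = 2.0) -> list[str]:
    """Genes whose log2 ratio lies more than n_sd sample s.d. below the mean."""
    values = list(log2_ratios.values())
    cutoff = mean(values) - n_sd * stdev(values)
    return sorted(gene for gene, value in log2_ratios.items() if value < cutoff)


# Hypothetical mutant/wild-type log2 ratios for ten genes
ratios = {
    "geneA": 0.1, "geneB": -0.1, "geneC": 0.2, "geneD": -0.2, "geneE": 0.0,
    "geneF": 0.1, "geneG": -0.1, "geneH": 0.0, "geneI": 0.05, "geneJ": -5.0,
}
flagged = downregulated(ratios)
```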

The activity of 18 promoters consistently downregulated in GNP lines, 
and which contain one or more candidate PbAP2-G-binding motifs,
was analysed in wild-type and GNPm9 parasite backgrounds. Male, 
female or sex-specific genes downstream of AP2-G in the gametocyte 
developmental pathway were identified (Fig. 3b, Supplementary Fig. 8 
and Supplementary Table 8). Single point mutations in PbAP2-G-binding 
motifs did not significantly reduce stage- or sex-specific expression of
all of a number of reporter genes in vivo, even if identical changes ablated
DNA binding in vitro. Only larger promoter truncations produced an
impact on expression (Supplementary Fig. 9). Therefore, the relatively 
single transmission through mosquitoes. Scale bar, 6 µm. e, Gametocyte
quantification from manual counting in Giemsa-stained blood smears of an
independently produced pbap2-g deletion mutant before and after
complementation (comp.) with the DS (downstream) vector and of two
independent pbap2-g2 knockout mutants. Error bars show standard deviations
from three replicates. The loss of gametocytes from the knockout mutants was
significant (P < 0.05). f, Relative growth kinetics of GNPm9, pbap2-g⁻ and
pbap2-g2⁻ lines determined by flow cytometry. Left, cloned GNPm9
constitutively expressing CFP (line GNPm9-CFP) was mixed in a 1:1 ratio with
wild-type (PBANKA HP) producer line constitutively expressing RFP (line
WT-RFP). The daily percentage of the population expressing either RFP (red),
CFP (blue) or both (purple; reflecting cells infected with multiple parasites) was
calculated. Right four panels, deletion vectors for pbap2-g, pbap2-g2 or p28
(control gene for neutral growth rate) were transfected in GFP- or mCherry-expressing
lines (blue and red bars, respectively) and the relative abundance of
each mutant determined in mixed infections of uncloned parasites. Error bars
show ± standard deviations from three biological replicates. The competitive
advantage was significant for the pbap2-g⁻ (P < 0.01) but not the pbap2-g2⁻
parasites (two-tailed Student’s t-test for change in relative abundance). RBC, red
blood cell.


simple and highly abundant PbAP2-G motif is only active in context 
and its presence not always indicative of a critical role for the activity 
of a particular promoter. The PbAP2-G motifs upstream of pbap2-g 
do appear to be important as gametocytogenesis is blocked when the 
allelic motifs are both deleted, supporting the concept that commit- 
ment to gametocytogenesis requires a positive feedback loop powered 
by PbAP2-G itself (Fig. 3c). 

The discovery of the ApiAP2 family* was the first identification of 
predicted transcription factors in apicomplexan genomes, otherwise 
thought to be remarkably lacking in genes encoding transcription factors’. 
The majority of ApiAP2 transcription factors are probably essential, 
involved in the progression of the intraerythrocytic asexual development 
of Plasmodium. Roles for additional ApiAP2 factors in the continuation 
of development of parasite forms associated with transmission have 
been demonstrated, namely for the ookinete (PbAP2-O°), sporozoite 
(PbAP2-S"°) and liver stages (PbAP2-L") of development. ApiAP2 proteins 
may also silence genes, possibly through maintenance of heterochromatin”. 
The AP2/ERF family members in Plasmodium are predicted to act singly 
or in combinations that control the continuation of the transcriptional 
Figure 2 | Characterization of the DNA-binding specificity, expression and
subcellular localization of PbAP2-G. a, Top, protein binding microarray 
determination of the DNA binding recognition preference of the recombinant 
DBD of PbAP2-G. GST, glutathione S-transferase. Bottom, EMSA in which a 
shift indicates whether the PbAP2-G DBD binds to double-stranded DNA
containing wild-type (W) or mutated (M) motifs (panels al-d1 and a2-d2, 
respectively) from the upstream regions of pbap2-g itself, pbap2-g2, and 
position —610 of the hypothetical gene spm1 (subpellicular microtubule 
protein 1, PBANKA_081070). b, Expression analysis by reverse-transcriptase 
(RT)-PCR of pbap2-g in targeted and spontaneous pbap2-g mutants and the 
wild-type control line, 820. The 1.15-kb product indicates lack of transcript 


programme of the Plasmodium life cycle**. Heritable gene-regulatory 
strategies include epigenetic marks, stable cytoplasmic factors and tran- 
scriptional autoregulatory circuits that can determine distinct cell fates’. 
In the latter, commitment to a specific developmental pathway (for 
example, gametocytogenesis) is probabilistic, its frequency being defined 
by the likelihood of the interaction of a fate-determining transcription 
factor with a critical promoter often triggering a positive autoregulatory 
feedback loop that commits the cell'*, a paradigm that has been invoked 
within the Plasmodium AP2 transcription factor network”. P. falciparum 
uses precise epigenetic control to influence the sub-nuclear location of 
pfap2-g'° and therefore possibly PfAP2-G binding which, when coupled 
to an autoregulatory positive feedback loop (Fig. 3c) involving PfAP2-G 
production, could provide flexible control of gametocytogenesis in a 


Table 1 | Changes in gene expression in mutants 


only in the targeted knockout line. Primer positions were as shown in the 
schematic. See Supplementary Fig. 7 for pbap2-g transgene expression data. 

n = 3. gDNA, genomic DNA. GO, Glasgow oligo. c, Localization of the pbap2-g 
minigene product to the nucleus of P. berghei female gametocytes. CFP was 
sandwiched between the N-terminal 300 base pairs (bp) and the C-terminal 
800 bp of pbap2-g, including the DBD, and expressed from 2 kb of the pbap2-g 
promoter in line 820. Expression was only detected in the nuclei of female 
gametocytes (>50 observations in three experiments). It is the C-terminal 
segment that determines the nuclear localization of PbAP2-G (Supplementary
Fig. 8). Scale bar, 6 µm. Cartoon is not to scale. DIC, differential interference
contrast. 


manner that would also be amenable to environmental sensing'*"’. AP2-G
is, at present, unique within the ApiAP2 transcription factor family
in that it directs a change in developmental fate rather than merely 
progressing a lineage (Supplementary Fig. 10), distinguishing it from 
AP2-G2 and from a number of other genes required for gametocyte 
maturation”. This critical role of AP2-G is conserved in P. falciparum”, 
even though models for the timing of commitment in the two parasites 
differ”**!. Orthologues of the ap2-g DBD are present in all sequenced 
Apicomplexa, raising the possibility that mechanisms of commitment 
to sexual development may also be conserved (Supplementary Fig. 11). 
Thus these data identify the earliest known event in parasite transmis- 
sion. Because it occurs in the blood of the host it is amenable to and 
suggests novel control strategies largely through drug development and 


Gene ID  Description                            Rank  GNP    P          pbap2-g KO2  pbap2-g2 KO1
051500   25-kDa ookinete surface antigen        1     −4.56  2.5 x10?   −4.88        −1.72
051490   28-kDa ookinete surface antigen        2     −3.48  2.9 x10?   −6.28        −2.37
133370   Phosphodiesterase delta                125   −3.61  1.3 x10°2  −3.89        −1.32
121910   Heat-shock protein 90                  175   −3.34  7.6 X10-?  −3.67        −1.93
142170   Secreted ookinete protein, putative    62    −3.95  1.0 x10~+  −3.98        −1.42
131950   LCCL domain-containing protein CCP2    64    −3.09  6.2 x10?   −3.79        L131
146300   Osmiophilic body protein               232   −1.63  1,2:x1077  −2.60        −0.27
112040   Pfs77 homologue, putative              52    −2.68  3.4 x10°7  −3.50        −0.78
134040   Oxidoreductase, putative               327   −4.59  5.9 x10-2  −2.80        −1.77
123130   Metabolite/drug transporter, putative  26    −3.31  5.0 x10~2  2.82         −1.46


Gene expression was determined on Agilent microarrays for in vitro-cultured schizonts, comparing pooled GNP clones and targeted mutants to their parental control lines. Log2 fold changes are shown for the top
10 genes with good functional annotation that were most strongly deregulated in the targeted mutant pbap2-g KO1. Gene IDs are given without their PBANKA_ prefix. Rank refers to the absolute expression rank
among 4,553 genes in purified gametocytes determined from three biological replicates. Expression data are means from three biological replicates for each mutant. P denotes the P value adjusted for multiple
testing. For the complete data and all P values see Supplementary Table 6.

Figure 3 | pbap2-g acts upstream of gametocyte gene transcription.
a, Volcano plot of log2 fold change in gene expression in schizonts of pbap2-g
KO1 (whole ORF deletion) versus wild-type line 820 against significance of
change (−log10 P value, t-test). Red triangles indicate genes upregulated in gametocytes
compared to schizonts. Black and yellow shapes are genes detailed in Table 1 
and Fig. 3c, respectively. b, Reporter-gene expression constructs were 
transfected into the GNPm9 and 820 control clones to confirm gametocyte- 
gene-specific promoters. Reporters contained 2 kb of upstream sequence from 
the indicated genes driving CFP expression with a constitutive 3’ untranslated 
region. Bar plots show CFP measured by flow cytometry over 3 days in the 
820 line. Life cycle stages (asexual, male and female) are separated on the basis 
of GFP or RFP expression. Mean of three measurements (geometric mean CFP
fluorescence) ± s.d.; *P < 0.05, **P < 0.01, ***P < 0.001, two-tailed t-test.
Flow cytometry plots are shown for CFP expression of reporters in 820 
(parental) (left) or GNPm9 (right) lines. Plots show GFP (x axis) versus RFP 
(y axis) expression for all infected red blood cells and CFP expression in 
magenta. Numbers on each plot represent the percentage of events within each 


offers some strategic value in the prevention of sexual development 
and reduction of transmission. 


METHODS SUMMARY 


P. berghei ANKA parasites were maintained in female Theiler’s original (TO) mice 
(6-8 weeks old) under appropriate Home Office licences. A fluorescent reporter 
line 820 (ref. 3) for male (green) and female (red) gametocytes was transmitted weekly 
by blood passage into a new host for up to 52 weeks in 10 parallel lines and game- 
tocytaemia assessed weekly by flow cytometry. Whole-genome sequencing was fol- 
lowed by de novo assembly and variant calling. Targeted gene knockouts were 
generated using traditional plasmids or PlasmoGEM vectors”. GNP phenotypes 
were confirmed by a variety of methods. Genetic complementation was by ends-out 
recombination over the region mutated in GNP clones and confirmed functionally 
by FACS and mosquito passage. A pbap2-g DBD-GST fusion protein was used in 
protein binding microarray analysis as described’. The purified GST-recombinant 
protein was used in EMSA assays with 60-mer biotinylated annealed oligonucleo- 
tides. Microarray analysis was performed on total RNA on an Agilent array” and 
data submitted to the Gene Expression Omnibus (GEO) database. Reporter constructs 
were transfected into 820 and GNPm9. Reporter expression was monitored by FACS 
over several days. The promoter of pbap2-g was modified by ends-out integration 
into 820 and gametocytaemia monitored over several days using flow cytometry. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper.

gate that are positive for CFP (see also Supplementary Fig. 8). c, Deletion
studies in the pbap2-g promoter provide support for a role of PbAP2-G binding
motifs in the positive feedback regulation of pbap2-g expression. Top, DNA 
constructs containing a selectable marker were integrated into the promoter 
region of pbap2-g in PBANKA 820. The constructs either deleted 207 bp 
surrounding the two instances of the PbAP2-G binding motif at the positions 
indicated (G901 and G902) or did not (G903 and G904). Two sites of selectable 
marker integration were tested, 2 and 3 kb upstream of the ORF of pbap2-g. 
In addition, interruption at −1,288 upstream of the ORF of pbap2-g was shown
to disrupt gametocytogenesis (Supplementary Fig. 4f). Control line G905 was 
transfected with a reporter construct targeted to the p230p locus and known 
not to affect gametocytogenesis. Bottom, gametocytaemia was measured on 
consecutive days by flow cytometry once the parasitaemia reached >1%. Mean
± s.d. shown, *P < 0.05 compared to 820 parental (two-tailed t-test). Data
shown are pooled from 3 days’ observations and representative of three 
independent experiments.

METHODS

Parasite lines and methods. P. berghei ANKA HP was obtained from C. Janse at 
Leiden University Medical Centre and was originally referred to as clone 15Cy1A 
(Leiden Malaria Group website). Line 820 was generated from HP. P. berghei 
ANKA clone 2.33 is a non-gametocyte-producing clone line reported in 1990 
and is now widely distributed and grown by mechanical passage™*. All infections 
were performed on female Theiler’s original (TO) mice (age 6-8 weeks; weight 25- 
30 g) according to Home Office licence regulations and the local ethical commit- 
tees. All animals were assigned to experiments without pre-selection and no blind 
assignations were performed. Serial passage of freshly cloned P. berghei reference 
line 820 mIcl1 (ref. 3) was performed as follows: 10 mice (m1-m10) were initially 
infected with 200 µl of a 1:200 dilution of blood from a mouse infected with line 820 at a
parasitaemia of ~2%. In the absence of any a priori information concerning mutation
rates in P. berghei a sample size of 10 was selected based on concerns of animal 
welfare, cost and logistics. Each week, the infections were passaged to a further 10 
mice in a similar manner when the parasitaemia was >1%. Parasitaemia and 
gametocytaemia were monitored by examination of Giemsa-stained blood films 
and by flow cytometry as described”. The infected blood from each mouse was 
also cryopreserved each week. Passage to a fresh mouse was halted when a line was 
negative for gametocyte production for 4 consecutive weeks and designated GNPmx, 
where x would be 1-10. The experiment was halted after 52 weeks. Lines GNPm7, 
GNPm8 and GNPm9 were cloned by limiting dilution, clones subjected to nega- 
tive selection*® to remove the selectable marker residual in the GFP:RFP selection 
cassette and cloned once more. Each parasite cloning procedure used 10 mice, and 
mice were infected by intravenous tail injection with an average of 1.5 parasites, 
which in our experience will give rise to 4 infected mice. Negative selection involved 
3 mice, the infections of which were assayed by PCR for completeness of selection. 
Lines generated in this way were designated m(7,8,9)mxClx, indicating the mouse 
and clone number identifiers from the negative selection process. In the main text
these cloned, negatively selected lines are referred to simply as GNPm7, GNPm8 and
GNPm9.
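
As a worked illustration of the limiting-dilution cloning step, the sketch below relates the fraction of mice that become infected to an effective infective dose per mouse. It assumes that the number of parasites establishing an infection is Poisson-distributed, a common model for limiting dilution that is not stated in the text; the function names are hypothetical.

```python
import math

def fraction_infected(mean_dose: float) -> float:
    """P(at least one infective parasite) under the assumed Poisson model."""
    return 1.0 - math.exp(-mean_dose)

def effective_dose(infected: int, injected: int) -> float:
    """Mean infective dose implied by the observed fraction of infected mice."""
    return -math.log(1.0 - infected / injected)

def prob_clonal(mean_dose: float) -> float:
    """P(exactly one founder parasite | an infection was established)."""
    return mean_dose * math.exp(-mean_dose) / fraction_infected(mean_dose)

# 4 infected mice out of 10 injected, as reported for each cloning procedure
lam = effective_dose(infected=4, injected=10)
print(round(lam, 2))               # effective infective dose per mouse
print(round(prob_clonal(lam), 2))  # chance that an infection has a single founder
```

Under this model, 4 infected mice out of 10 corresponds to an effective dose of about 0.51 infective parasites per mouse (below the nominal 1.5 injected), and roughly 77% of established infections would be expected to derive from a single parasite.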

Transfection of GFP- and RFP- expressing ‘wild-type’ parasites from the P. berghei 
line 820 with linearized targeting constructs, selection and cloning of the mutant 
parasites were performed according to procedures described previously*’. Genotypic 
analysis of transfected parasites was performed by Southern analysis of chromo- 
somes separated by field-inversion gel electrophoresis and using diagnostic PCR on 
genomic DNA. Details of the primers used for PCR are shown in Supplementary 
Table 9. Phenotype analysis of mutant parasites during blood-stage development, 
quantification of gametocyte production and ookinete development in vitro was 
performed using standard methods as described previously** *’. Mosquito-stage 
development was analysed in Anopheles stephensi mosquitoes using standard methods 
of mosquito infection, analysis of oocyst and sporozoite production and sporo- 
zoite infectivity to TO mice*. The capacity of wild-type and engineered parasites 
to infect mice by mosquito-interrupted feeding was determined by exposure of 
female TO mice (n = 2-4) to 40-50 mosquitoes at day 21 after the infectious blood 
meal. Infection was monitored by analysis of blood-stage infection in Giemsa- 
stained films of tail blood from day 4 until day 8 after infection. Infectivity was
recorded as ‘wild type’ if mice developed a parasitaemia of 0.1-0.5% at day 4 after 
infection. For the 2.33 rescue experiment, images representative of >80 gameto- 
cytes at a parasitaemia of 8.2% and gametocytaemia of 5.4% are shown in Fig. 1d; 
similar results were seen on 3 consecutive days. 

DNA-sequencing. To sequence clones 2.33, 820, GNPm7, GNPm8 and GNPm9, 
libraries of 300-500-bp fragment length were generated following a PCR-free 
protocol**. The libraries were sequenced using an Illumina Genome Analyser II 
with the V4 chemistry. A summary of reads for each project, including accession
codes, is given in Supplementary Table 3. Data are available at
http://www.ebi.ac.uk/ena/data/view/ERP000253.

Sequencing: de novo assembly. We generated a de novo assembly of reads from
the 820 parental clone using Velvet** version 1.0.12 and the following parameters:
-exp_cov auto -min_contig_lgth 500 -cov_cutoff 10 -ins_length 350
-min_pair_count 20. We obtained 417 supercontigs with an N50 of
240 kb. We processed the assembly as described in the post-assembly genome- 
improvement toolkit protocol’’. In short, scaffolds were ordered with ABACAS”*® 
against the P. berghei ANKA reference genomes (GeneDB, version July 2010). This 
resulted in 16 pseudomolecules (14 chromosomes and 2 plastids) and a ‘bin’ of 100 
contigs that could not be associated with a chromosome. Next, using scaffolds of at 
least 1 kb as a substrate, IMAGE” was used to close 469 (61%) of the 774 sequen- 
cing gaps. Single-base and indel errors were corrected using ICORN™. This cor- 
rected 1,067 single-base errors and 92 indels. 1,589 positions had heterozygous 
calls, which represented collapsed repeats, mostly in P. berghei interspersed repeat 
(bir) genes. Last, the annotation of the P. berghei ANKA reference genome was trans- 
ferred onto the improved P. berghei 820 assembly using RATT” (Assembly option). 


In total, 4,821 of the 4,938 gene models were transferred correctly. The assembly is 
available on ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmodium/berghei/820/. 
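
For readers unfamiliar with the N50 statistic quoted for the assembly, the following minimal sketch shows how it is usually computed: it is a length-weighted median rather than an arithmetic mean. The contig lengths are toy values, not data from this study, and the function names are hypothetical.

```python
def n50(contig_lengths):
    """Largest length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs supplied")

def percent(part, whole):
    """Simple percentage helper for the summary counts quoted in the text."""
    return 100.0 * part / whole

print(n50([100, 80, 60, 40, 20]))  # toy contig lengths
print(round(percent(469, 774)))    # sequencing gaps closed by IMAGE
print(round(percent(4821, 4938), 1))  # gene models transferred by RATT
```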

Sequencing: variant calls. To call variants, SMALT (version 0.6.2,
http://www.sanger.ac.uk/resources/software/smalt/; parameters: -r 0, -x, -y 0.8, -i 1000, and for
the index a k-mer size of 17 (-k) and a step size of 3 (-s)) was used to map reads against
the generated 820 assembly. After generating BAM files with the SAMtools package”,
variation was called with GATK™ (parameters -ploidy 1 -glm POOLBOTH
-pnrm POOL). For the reads mapped onto the 820 assembly, the variation of each
clone, and concordance with other clones was analysed using a PERL script. For 
the reads mapped onto the ANKA reference genome, the script ignored variants 
that were called in all m7-m9 clones as well as 820. The quality filter for a variant 
was 60. The pipeline for whole-genome sequencing and identification of single 
nucleotide polymorphisms is summarized in Supplementary Fig. 12. 

Variant calling in Plasmodium from re-sequencing data is inherently noisy,
owing to false calls within repeats and low-complexity regions. Thus, 3 independent
clones were used to identify coincident site(s). Isolate-specific variation is
catalogued in Supplementary Table 3 and the large proportion of heterozygous
calls is highlighted (a manifestation of calling variants within repetitive and
low-complexity regions).
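
The logic of using independent clones to separate real mutations from noisy calls can be sketched as follows. This is an illustrative reconstruction, not the authors' PERL script: the record layout, the gene-level definition of coincidence and the use of "greater than or equal to" for the quality threshold of 60 are all assumptions, and the gene names are placeholders.

```python
from collections import defaultdict

MIN_QUAL = 60  # quality filter quoted in the text (inclusive bound assumed)

def private_variants(clone_calls, parental_calls):
    """Keep calls that pass the quality filter and are absent from the parental line."""
    parental_sites = {(v["chrom"], v["pos"]) for v in parental_calls}
    return [
        v for v in clone_calls
        if v["qual"] >= MIN_QUAL and (v["chrom"], v["pos"]) not in parental_sites
    ]

def genes_hit_in_all(clones, parental_calls):
    """Genes carrying a private variant in every independent clone."""
    hits = defaultdict(set)
    for name, calls in clones.items():
        for v in private_variants(calls, parental_calls):
            hits[v["gene"]].add(name)
    return sorted(g for g, names in hits.items() if len(names) == len(clones))

# Toy calls: only gene_A is mutated (at different positions) in all three clones
parental = [{"chrom": "chr1", "pos": 100, "qual": 90, "gene": "gene_X"}]
clones = {
    "m7": [
        {"chrom": "chr1", "pos": 100, "qual": 95, "gene": "gene_X"},   # shared with parent
        {"chrom": "chr14", "pos": 500, "qual": 80, "gene": "gene_A"},
        {"chrom": "chr3", "pos": 42, "qual": 20, "gene": "gene_B"},    # fails quality filter
    ],
    "m8": [{"chrom": "chr14", "pos": 650, "qual": 75, "gene": "gene_A"}],
    "m9": [
        {"chrom": "chr14", "pos": 720, "qual": 99, "gene": "gene_A"},
        {"chrom": "chr5", "pos": 7, "qual": 88, "gene": "gene_C"},     # clone-specific
    ],
}
print(genes_hit_in_all(clones, parental))
```

Requiring a hit in every clone discards clone-specific calls, which are the ones most likely to arise from repeats and low-complexity regions.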

All data were generated using ad hoc scripts (available upon request). The variant
(.vcf) files for each isolate are available from
ftp://ftp.sanger.ac.uk/pub/pathogens/Plasmodium/berghei/820/vcf.

Phylogenetic analysis. Data were generated from the results of a BLASTP search 
of EuPathDB Apicomplexa using the AP2 domain from PBANKA_143750 as the 
query. Significant hits were defined as those that covered at least 75% of the length 
of the query domain and had >50% conserved residues. A neighbour-joining tree
was generated in CLC Genomics Workbench (version 6.5.1) using the Jukes-Cantor
protein distance measure. Values shown are for 1,000 bootstrap iterations.
The tree is rooted using the most distant Arabidopsis thaliana DBD Q9MOI0.2. 
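
The hit criteria and the distance measure named above can be written out as a short sketch. Two points are assumptions: that the fraction of conserved residues is taken over the aligned length, and that the standard 20-state Jukes-Cantor correction is what the software applies (its exact implementation is not described here). Function names are hypothetical.

```python
import math

def is_significant_hit(aligned_len, query_len, conserved):
    """Apply the two thresholds from the text to one BLASTP hit."""
    coverage = aligned_len / query_len           # at least 75% of the query domain
    conserved_fraction = conserved / aligned_len  # more than 50% conserved residues
    return coverage >= 0.75 and conserved_fraction > 0.50

def jukes_cantor_protein(p, states=20):
    """Jukes-Cantor corrected distance from the observed fraction p of differing sites."""
    b = (states - 1) / states
    if not 0 <= p < b:
        raise ValueError("p must lie in [0, (states - 1) / states)")
    return -b * math.log(1 - p / b)

print(is_significant_hit(aligned_len=48, query_len=60, conserved=30))
print(round(jukes_cantor_protein(0.3), 3))
```

The correction always returns a distance at least as large as the raw proportion of differences, because it allows for multiple substitutions at the same site.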

Recombinant protein production. N-terminal GST-fused extended ApiAP2
DBDs (cloned into pGEX-4T1) from P. falciparum ap2-g (PFL_1085w) and P. berghei
ap2-g (PBANKA_143750) were expressed in Rosetta (DE3) pLysS competent cells
with 0.2 mM IPTG at 25 °C and batch-purified using affinity chromatography
(Glutathione HiCap Matrix slurry; Qiagen). The purity of protein was estimated 
by 10% SDS-PAGE and the eluted proteins were quantified with spectrophoto- 
metry by optical absorbance at 260 nm. The eluted protein yield was concentrated 
and buffer exchanged using Amicon Ultra-0.5 Centrifugal Filter Devices (30K 
device; Millipore). The properties of the DBD fusion proteins produced and used 
in this study are indicated in Supplementary Table 10. 

Protein binding microarray analysis. Protein binding microarray analyses were 
processed and analysed as described previously”. 

EMSAs. DNA binding of purified N-terminal GST fusions of AP2 domains of 
AP2-G of P. falciparum (PF3D7_1222600) and P. berghei (PBANKA_143750) to 
their cognate DNA sequences was analysed by EMSA. Single-stranded oligonu- 
cleotides containing the recognition motif flanked either by random nucleotides 
(same for all flanking sequences) or by the actual genome sequence (as they occur 
naturally in the 5’ upstream regions of potential AP2 target genes) and their cor- 
responding complementary oligonucleotides were synthesized and purchased from 
MWG Eurofins (Germany) as labelled (5’-biotinylated and HPLC purified) and 
unlabelled sequences. Complementary single-stranded oligonucleotides were annealed 
to create double-stranded probes and used for EMSA as labelled and unlabelled 
target probes for the DBD of AP2-G. EMSAs were performed using the LightShift
Chemiluminescent EMSA kit (Pierce). In brief, 2 µg of the purified GST fusion of
PfAP2-G or PbAP2-G (in separate reactions) was pre-incubated with 0.02 pmol
of the labelled probe in 20 µl of the binding reaction containing binding buffer, 1 µg
poly(dI-dC), 50% glycerol, 100 mM MgCl2, 1% NP40 and 60 pg BSA at room
temperature (22 °C) for 10 min. The unlabelled probe (4 pmol; 200-fold excess
over the labelled probe) was then added as a competitor and the reaction was incubated
for a further 20 min at room temperature. The reaction was fractionated using
12% PAGE and transferred to a nylon membrane (Hybond) as per manufacturer’s 
instructions. Specific binding of the AP2 domain with the target motif was detected 
as an upward shift using the Chemiluminescence Nucleic Acid Detection Module 
(Pierce), as per the manufacturer’s instructions, and anti-GST antibodies. 

Southern blot analysis. Southern blot analysis of wild-type line 820 and three
different pbap2-g length-variable knockouts was performed to show successful
integration of the selectable marker cassette at the desired genetic locus. In brief,
approximately 10 µg of Plasmodipur (EuroProxima)-filtered and purified genomic
DNA from lines 820 (wild type), G401cl1 (complete ORF knockout), G418cl6cl3
(DBD knockout) and G529cl2 (partial ORF knockout bearing the GNPm7, 8 and 9
mutations) was double-digested each with 7 µl of appropriate restriction enzyme
(New England Biolabs) pairs at 37 °C for 4 h with NEB Buffer 4. For comparison
with the wild-type line (820), gDNA from wild type and G401cl1, wild type and 
G418cl6cl3, and wild type and G529cl2 was double-digested with the High-Fidelity
versions of NcoI and SpeI, NcoI and BamHI, and EcoRI and SpeI, respectively.
After transfer the membrane was hybridized (60 °C overnight) with 32P-labelled
single-stranded DNA probe for a specific region from one of the homology arms
used for generating the gene targeting vector. The probes were PCR-amplified and
purified using the following oligonucleotides: GU1058 and GU1059 for G401cl1,
GU1416 and GU1417 for G418cl6cl3 and GU1414 and GU1415 for G529cl2. The
membrane was washed three times with decreasing concentrations of SSC (3× SSC,
1× SSC, 0.5× SSC) and exposed to a maximum-resolution X-ray film (BioMax
MR film; Kodak) for 35 h.

Northern blot analysis. Approximately 5 µg of RNA sample for each line (except
G529cl2, which was ~2 µg) was denatured and fractionated in a 1.2% agarose gel in
2.2 M (w/v) formaldehyde at 20 V overnight in 1× MOPS as running buffer. After
transfer the RNA on the membrane was hybridized (60 °C overnight) with 32P-labelled
single-stranded DNA probe for p28 messenger RNA (PBANKA_051490;
0.62 kb ORF) and normalized using hsp70 mRNA probe (PBANKA_071190; 2.08 kb
ORF), washed and exposed to a maximum-resolution X-ray film (BioMax MR film;
Kodak).

Recombineering methods. Gene knockout vectors for pbap2-g and pbap2-g2
were submitted to the PlasmoGEM database as PbGEM-072446 and PbGEM-039238,
respectively”, where details of their construction can be found. Complementation
vectors were made using the Red recombination system of phage lambda
using published protocols*. First, E. coli harbouring P. berghei gDNA clone PbG01- 
2472c01, which carries a >11-kb genomic insert including pbap2-g in the pJAZZ- 
OK linear plasmid (Lucigen), were rendered competent for recombination by 
transfection with plasmid pSC101gbaA”’. A marker cassette for positive and nega- 
tive selection in E. coli, attR1-zeo-pheS-attR2, was then amplified using primer 
pairs Comp143750UpR1/2 or Comp143750D1R1/2 (see Supplementary Table 11 
for primer sequences). The resulting PCR products carried 50-bp extensions 
homologous to the upstream or downstream intergenic regions of pbap2-g, respec- 
tively. The PCR products were introduced into the recombination-competent E. 
coli carrying the PbG01-2472c01 library plasmid and the recombination product 
selected with Zeocin. The bacterial marker was then exchanged for the P. berghei 
selection marker hdhfr-yfcu in an in vitro Gateway reaction, the product of which 
was retransformed into E. coli and negatively selected on YEG-Cl and kanamycin 
as described**. Clones carrying the correct complementation plasmid were iden- 
tified by PCR across the boundary of the hdhfr-yfcu cassette. Before transfection 
the constructs were linearized using NotI removing the plasmid backbone. 

Reporters: construct generation. The CFP reporter construct pG0148 was generated
by inserting CFP into pG073 as follows: CFP was amplified from pL1382
using primers to incorporate XhoI and SmaI restriction sites. This was cloned into
the XhoI/SmaI sites of pG073 (KH unpublished) between an hsp70 (PBANKA_071190)
promoter (1.4 kb) and p45/48 constitutive 3’ UTR. The plasmid also contains a
negative-selection cassette” and target regions for DXO integration into a p230p
locus downstream of the GFP/RFP cassette in the 820 line”. Candidates for
reporter analysis in the first batch (rep 1-14) were chosen on the basis of fold 
downregulation in GNP versus 820 schizont, the presence of at least one predicted 
AP2 binding motif (GTACxC or GxGTAC or GGTACxC) and at least moderate 
expression levels in at least one life cycle stage. For some of the second batch of
reporters, based on analysis of trophozoite-stage transcripts (rep 15-24), the additional
criterion of not being predicted to be translationally repressed was applied. 2 kb of
sequence immediately upstream of the predicted translational start site (PlasmoDB)
was amplified by PCR using Taq polymerase and primers incorporating KpnI/XhoI
restriction sites. pG0148 was digested with KpnI/XhoI to excise the hsp70
promoter and new reporter promoters ligated in. To introduce mutations into the 
predicted AP2-G binding sites an overlapping PCR strategy was used to mutate 
the GTAC to GTAA. A primer designed around the site incorporating the mutation
in both forward and reverse complement was used with the original forward
and reverse primers for the 2 kb fragment in a two-stage overlapping PCR reaction.
The fragment was cloned into pG0148 and sequenced to confirm the mutation.
After verification of the correct insert, 15-30 µg of plasmid DNA was digested
with SacII to linearize the integration fragment and subsequently cut with either
ScaI or SapI to cut the plasmid backbone and minimise the risk of introducing episomes.
Fully digested DNA was ethanol precipitated and re-suspended in water
before being mixed with 100 µl Nucleofector (Lonza Amaxa) solution for transfection
into 820 and GNPm9 lines.
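
The motif criterion and the GTAC-to-GTAA mutation described above can be illustrated with a small scanner. This is a sketch under stated assumptions rather than the procedure actually used to pick candidates: "x" is read as any base, both strands are searched, and the function names and test sequence are invented for the example.

```python
import re

# Motif variants listed in the text; "x" (any base) is written as "."
MOTIFS = ("GTAC.C", "G.GTAC", "GGTAC.C")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motifs(seq):
    """Return sorted (start, strand, motif) tuples; start is 0-based on the forward strand."""
    seq = seq.upper()
    rc = revcomp(seq)
    hits = set()
    for motif in MOTIFS:
        # A lookahead reports overlapping occurrences as well
        pattern = re.compile(f"(?=({motif}))")
        for m in pattern.finditer(seq):
            hits.add((m.start(), "+", motif))
        for m in pattern.finditer(rc):
            start = len(seq) - m.start() - len(m.group(1))
            hits.add((start, "-", motif))
    return sorted(hits)

def mutate_core(seq):
    """Replace every GTAC core with GTAA, as in the mutated reporter promoters."""
    return seq.upper().replace("GTAC", "GTAA")

promoter = "TTGTACACTT"  # invented 10-bp example
print(find_motifs(promoter))
print(mutate_core(promoter))
print(find_motifs(mutate_core(promoter)))
```

Because GTACxC and GxGTAC are reverse complements of each other, a single site is reported once on each strand. The GTAC core is its own reverse complement, so the single-base change to GTAA removes the motif from both strands.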

Reporters: transfection. DNA prepared as above (4-12 µg per transfection) was
mixed with Nycodenz-purified synchronous P. berghei schizonts of lines 820 or
GNPm9 and electroporated using programme U33 of the Amaxa machine. Parasites
were then immediately injected into the tail vein of a TO mouse. 24-28 h
after transfection the parasites were placed on positive selection by including
pyrimethamine (Sigma) in drinking water*’.

Reporters: flow cytometric analysis. Analysis was performed on parasites from
tail blood on days 6-10 after transfection. 2 µl of tail blood was placed into 500 µl
rich PBS (PBS (Roche) with 20 mM HEPES, 20 mM glucose, 4 mM NaHCO3, 0.1%
BSA) containing 1 µl Vybrant DyeCycle Ruby (Invitrogen) and incubated at 37 °C
for 30 min. Parasites were pelleted and re-suspended in 1.5 ml of FACS buffer (PBS
(Roche) with 2 mM HEPES, 2 mM glucose, 0.4 mM NaHCO3, 0.01% BSA, 2.5 mM
EDTA). Analysis was performed on a CyAn ADP 9 colour flow cytometer (Beckman
Coulter) equipped with 405-nm, 488-nm and 642-nm solid-state lasers, and 500,000
events were acquired (counting all events except debris). On each day an uninfected
control and CFP-negative parental controls were processed in parallel with reporter
lines. Data analysis was performed using Kaluza analysis software (Beckman Coulter)
following the gating strategy indicated in the following schematic. For histogram
analysis the CFP geometric mean expression level (AFU) in each gated population
(male, female and asexual) was calculated as a mean from three days’ data and
plotted as a bar chart in Excel.
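
The summary statistic used for the bar charts (a geometric mean per day, then a mean across three days) can be sketched as below. The intensity values are invented, and the use of the sample standard deviation (n − 1 denominator) is an assumption; the function name is hypothetical.

```python
from statistics import geometric_mean, mean, stdev

def summarize_cfp(daily_events):
    """daily_events: one list of per-cell CFP intensities (AFU) for each day.

    Returns the mean and sample s.d. of the daily geometric means.
    """
    daily_gm = [geometric_mean(day) for day in daily_events]
    return mean(daily_gm), stdev(daily_gm)

# Invented intensities for one gated population on three consecutive days
days = [[10, 1000], [20, 80], [90, 160]]
avg, sd = summarize_cfp(days)
print(round(avg, 1), round(sd, 1))
```

The geometric mean is the usual choice for fluorescence intensities because they are approximately log-normally distributed, so a few very bright cells do not dominate the summary.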

All events were plotted as forward scatter (FS) versus side scatter (SS) and gate E
drawn to exclude debris. Events in gate E were plotted on FS versus FS (area) and 
gate J(1) drawn to exclude potentially autofluorescent doublets and clumps. Events 
in gate J(1) were plotted FS versus Ruby (DNA stain) and gate G drawn to select 
infected cells. Gate G was drawn on the basis of a negative (uninfected) control 
population stained in the same way and analysed on the same occasion (Supplemen- 
tary Fig. 13a). 

Events in gate G were plotted SS versus CFP and a CFP positive gate drawn 
based on a non-CFP-expressing parental line (820, HP or GNP9) stained and pro- 
cessed on the same occasion and at similar parasitaemia. GFP versus RFP was 
plotted for all infected cells (events in G) and for only those falling into the
CFP-positive gate. Gates drawn on female F (RFP-positive) and male M (GFP-positive)
populations were used to calculate the percentage of each population that expresses
CFP based on the number of cells in each gate in each plot.
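
The gating hierarchy described above (debris exclusion, doublet exclusion, infected cells, then sex and CFP gates) amounts to a sequence of filters. The sketch below replaces the hand-drawn gates with simple rectangular thresholds, which is a simplification; all threshold values, field names and events are hypothetical.

```python
def percent_cfp_positive(events, t):
    """Sequential gating: E (debris) -> J(1) (singlets) -> G (infected) -> F/M -> CFP."""
    infected = [
        e for e in events
        if e["fs"] >= t["fs_min"] and e["ss"] >= t["ss_min"]  # gate E: exclude debris
        and e["fs_area"] <= t["doublet_ratio"] * e["fs"]      # gate J(1): exclude doublets
        and e["ruby"] >= t["ruby_min"]                        # gate G: DNA stain positive
    ]
    populations = {
        "female": [e for e in infected if e["rfp"] >= t["rfp_min"]],  # gate F
        "male": [e for e in infected if e["gfp"] >= t["gfp_min"]],    # gate M
    }
    result = {}
    for name, cells in populations.items():
        positive = sum(e["cfp"] >= t["cfp_min"] for e in cells)
        result[name] = 100.0 * positive / len(cells) if cells else float("nan")
    return result

thresholds = dict(fs_min=10, ss_min=10, doublet_ratio=1.5,
                  ruby_min=100, rfp_min=500, gfp_min=500, cfp_min=200)

def event(fs=50, ss=40, fs_area=55, ruby=300, rfp=10, gfp=10, cfp=0):
    """Build one toy event; defaults describe a single infected asexual cell."""
    return dict(fs=fs, ss=ss, fs_area=fs_area, ruby=ruby, rfp=rfp, gfp=gfp, cfp=cfp)

toy_events = [
    event(rfp=900, cfp=400),               # female, CFP-positive
    event(rfp=800, cfp=50),                # female, CFP-negative
    event(gfp=900, cfp=30),                # male, CFP-negative
    event(fs_area=120, rfp=900, cfp=400),  # doublet, excluded
    event(ruby=20, rfp=900, cfp=400),      # uninfected, excluded
    event(fs=2, ss=3, rfp=900, cfp=400),   # debris, excluded
]
print(percent_cfp_positive(toy_events, thresholds))
```

In practice the CFP-positive and infected-cell thresholds would be set from the CFP-negative parental line and the uninfected control processed on the same occasion, as the text describes.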

For illustrative figures the infected population (G) was plotted on GFP versus
RFP, with those events additionally falling into the CFP-positive gate coloured magenta and
those not CFP-positive coloured grey. The percentage of the population
within each gate expressing CFP (calculated as above) is indicated (Supplemen- 
tary Fig. 13b). 

Microscopy analysis. For some lines the CFP expression was analysed on a Zeiss
Axioplan II fluorescent microscope. A drop of tail blood was stained with 5 µM
Hoechst in enriched PBS for 10 min, then placed on a microscope slide under a
coverslip, sealed with nail varnish and visualized under a ×100 oil immersion
objective; images were captured and processed using Volocity software.

Methods for promoter interruption experiments. During attempts to rescue
gametocytogenesis in GNP lines by complementation rescue techniques we had 
observed that an interruption to the pbap2-g promoter slightly downstream of two 
GxGTAC motifs led to a loss of gametocyte production. To investigate this further 
a series of constructs was made to target the pbap2-g endogenous promoter and 
mutate specifically in the region of these GxGTAC motifs. Effect on gametocyto- 
genesis after integration of these constructs into the endogenous AP2-G promoter 
in the fluorescent 820 parental line could then be monitored using flow cytometry. 

Promoter interruption construct generation and transfection. A double-crossover
homologous recombination method was used to create targeted interruptions
of the pbap2-g endogenous promoter. The plasmid pL0035 was used, which contains
a selection cassette including human DHFR driven by the pbeef1aa promoter
surrounded by multiple cloning sites. Genomic fragments from the pbap2-g promoter
region were amplified by PCR from wild-type genomic DNA using Kapa HiFi
polymerase (KapaBiosystems) and cloned in piecewise as described below to
allow for flexibility with the vector for creating multiple mutations. The 207-bp 
region containing the GxGTAC motifs was synthesized by MWG-Biotech with or 
without point mutations in the core motif. All regions are described by their 
distance from the pbap2-g gene start. A downstream integration fragment from 
bp −416 to bp −1,277 was cloned in using SmaI and EcoRI and an upstream integration
region from bp −2,695 to bp −1,912 was cloned in using HindIII and SacII. The
region from bp −1,913 to bp −1,484 was cloned downstream of the selection cassette
and in front of the downstream integration region using KpnI and EcoRV to create
vector pG266 (2-kb deletion). Using SmaI and EcoRV the synthesized region from
−1,913 to −1,484, either wild type or containing single point mutations in the
GxGTAC motif, was cloned into vector pG266 to create pG298 (2-kb WT) or
pG312 (2-kb MutA). Additionally, a clone containing the wild-type 200-bp region
in reverse orientation was selected: pG299 (2-kb WT Rev). Subsequently the SmaI
cloning site in pG298 was removed to create pG313 (2-kb WT-Sma). To extend
the region of endogenous promoter remaining between the selection cassette and
the pbap2-g gene an additional fragment from −2,870 to −1,913 was cloned into
the KpnI site downstream of the selection cassette in pG266 and pG313 to create 
pG266+3 (3-kb del) and pG313+3 (3-kb WT). Constructs were linearized using 
HindIII and EcoRI, and approximately 10 µg of purified linear DNA was transfected
into P. berghei parasites (820 line) as described elsewhere.

Promoter interruption gametocytogenesis assays. Gametocyte levels in transfected
parasites were monitored by flow cytometry (on a FACS CyAn, Beckman
Coulter) on a drop of tail blood from animals containing the transfected parasites and 
maintained on pyrimethamine selection throughout from 6 days post-transfection 
for up to 5 consecutive days. Parasites were passaged into a clean animal main- 
tained on pyrimethamine selection and gametocytaemia followed. As the back- 
ground gametocyte levels measurable using our methods in the parental 820 line 
varied from ~3 to ~20% depending on parasitaemia and unknown factors, a 
control transfection was carried out to enable gametocyte levels to be monitored 
in a line that had been maintained under exactly the same conditions. This was 
usually the plasmid pG306, which integrated to the p230p locus and contains a CFP 
gene driven by the PBANKA_101870 promoter. This also enabled us to confirm 
general transfection efficiency in each batch of transfections. After gating on the 
infected population using DyeCycle Ruby staining, the percentage of parasites 
expressing RFP (female) or GFP (male) was calculated. Results shown
are the total gametocytaemia (male and female) as a percentage of the parasite
population and a mean ± s.d. from three readings from passaged animals. The 820
parental line is a mean from four readings.
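
As a small worked example of the readout just described, total gametocytaemia is the summed male and female gametocyte count expressed as a percentage of infected cells, averaged over readings. The counts below are invented and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def gametocytaemia(n_female, n_male, n_infected):
    """Total gametocytes (RFP-positive plus GFP-positive) as a percentage of infected cells."""
    return 100.0 * (n_female + n_male) / n_infected

# Invented (female, male, infected) counts from three readings of one line
readings = [(120, 80, 2000), (150, 90, 2000), (90, 70, 2000)]
values = [gametocytaemia(*r) for r in readings]
print(values)
print(mean(values), stdev(values))
```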

Minigene construction and analysis. pG0148 was generated as previously described
in the reporters section. To generate pG0157 a 2-kb fragment immediately upstream of
the pbap2-g gene was amplified using primers to incorporate KpnI and XhoI restriction
sites and cloned in place of the hsp70 promoter in pG0148. To generate
pG0189 a 300-bp fragment of pbap2-g was amplified to incorporate XhoI restriction
sites and was cloned in frame with CFP into the XhoI restriction site between
the hsp70 promoter and the CFP gene in pG0148. To generate pG0190, CFP was
amplified from pL1382 using primers to exclude the stop codon of CFP and
incorporate XhoI and SmaI restriction sites. This was cloned into pG073 to generate
pG0188 (not shown). A 900-bp C-terminal fragment of pbap2-g incorporating
the DBD was amplified from gDNA using primers to incorporate SmaI restriction
sites and cloned into the SmaI restriction site downstream of and in-frame
with CFP in pG0188. To generate pG0191 the pbap2-g promoter and first 300 bp
of coding sequence were amplified using primers incorporating KpnI and XhoI
restriction sites and were cloned in place of the hsp70 promoter in pG0190. Plasmids
were sequenced and 5-10 µg of linearized purified DNA transfected into
either 820 or GNPm9 lines as previously described for reporter genes. Resulting
transfected parasites were analysed by flow cytometry and fluorescence microscopy 
for expression and localization of CFP signal. Each experiment was performed 
independently three times. 

Competitive growth assays. GNPm9M1Cl1 was transfected with construct pG0148 
to constitutively express CFP from an hsp70 promoter to generate line GNP-CFP. 
An analogous construct with RFP driven by the hsp70 promoter was generated 
(pG0161) and transfected into a wild-type (HP) producer line to generate WT-RFP.
Also generated was a wild-type (HP) producer line expressing CFP from construct
pG0148 (WT-CFP). Each line was individually grown in a TO mouse under pyrimethamine
selection. 2 µl tail blood from each mouse was stained with Vybrant
DyeCycle Ruby (Invitrogen) to label infected red blood cells and then run on a
CyAn ADP 9 Colour flow cytometer (Beckman Coulter). After gating on infected
cells the CFP or RFP expression was analysed, showing that nearly 100% of each
population expressed the fluorescent marker. Parasites
were mixed to create a 50:50 mix of parasites containing either WT-CFP and
WT-RFP or GNP-CFP and WT-RFP. These were injected intravenously into
mice. Parasites were monitored daily by flow cytometry and after gating for infected 
cells the percentage of the population expressing either RFP (gate AF — +), CFP 
(gate AF + —) or both (gate AF+ +) reflecting mixed-multiply infected cells was 
calculated and plotted. On day 6, blood from each mouse was passaged into a new 
host and the time course continued. After day 11 parasites were cryopreserved. 
For the competition assays between the pbap2-g KO1, pbap2-g2 KO and p28 KO, 
the PlasmoGEM knockout vectors were transfected into the GFP- and mCherry- 
expressing parasites. Once the parasitaemia in transfected animals reached ~5%, 
they were used to generate an inoculum containing an equal proportion of red and 
green parasites. Accuracy of each inoculum was tested using flow cytometry. New 
mice were injected (1 X 10° parasites per animal) and kept under continued pyri- 
methamine treatment to prevent the emergence of untransfected parasites. The 
proportion of red and green parasites in the mixture was followed daily using flow 
cytometry. Three infected mice were used for each comparison. 

Microarray methods. An 8 × 15k custom microarray (Agilent) providing coverage
of the P. berghei genome at >1 probe per kb of coding sequence was used”’. Samples 
were prepared from parasites maintained using standard parasitological proce- 
dures. For schizont cultures parasites were obtained from cardiac puncture and 
grown overnight in culture. For ring-stage cultures parasites were matured in vitro 
to schizont stage in order to synchronise the population, then injected into a new 
host and allowed to reinvade. Blood was collected at 24 + 6h post infection and 
filtered through a magnetic column (variomacsD) to deplete mature stages and
gametocytes. For trophozoite-stage parasites, parasites prepared as for ring
stages were then cultured for a further 6 h. All samples were filtered through a
Plasmodipur filter to remove mouse leucocyte contamination before RNA pre- 
paration using a standard TRIzol method. Samples were processed for microarray 
using methods as described”’. For GNP and pbap2-g KO1 a two-colour micro- 
array hybridization was performed with a background pool of complementary 
DNA made from material from all life cycle stages (except late mosquito and liver 
stages). Parental control lines and experimental samples were then hybridized 
with the same background pool sample for all experiments. For pbap2-g KO2 and 
pbap2-g KO, the mutant samples were hybridized against the equivalent samples 
from the parental line and against each other. Arrays were scanned on an Agilent 
Microarray Scanner. Normalized intensities were then extracted using Agilent 
Feature Extractor. All expression data are available from the Gene Expression 
Omnibus database (http://www.ncbi.nih.gov/geo) under the accession numbers
GSE52859 and GSE53246. 

Statistical methods for microarrays. Three biological replicates were performed 
for each life cycle stage of pbap2-g KO1 line and the 820 parental line. Naturally 
derived GNP line (schizonts only) microarray results are representative of two 
technical replicates each from three independently derived GNP lines. These tech- 
nical replicates were performed in different laboratories using the same methods. 
The pbap2-g KO1 and GNP microarray data was uploaded to PUMADB (http://puma.princeton.edu/)
for further processing. The data was extracted as the log2 of
the fold change of red (sample) versus green (common pool) with minimal filter- 
ing to exclude background signal and median centred. The fold change between 
the GNP sample and the 820 parental line was calculated for each transcript and 
the mean and standard deviation of the replicates calculated (using Microsoft Excel). 
The distribution of these samples was confirmed to be normal (P < 2.2 X 102°, 
Kolmogorov-Smirnov test in R version 2.10), and the transcripts classed as down 
regulated in GNP lines were those 2 s.d. below the mean fold change. For plotting 
volcano plots (Fig. 3) a two-tailed t-test was performed on the independent replicates
and a −log10 transform of the resulting P value was calculated. This was plotted against the log2
fold change using the R ggplot2 library. To determine which transcripts were gametocyte-specific
the fold change between three replicates of gametocyte-stage wild-type
parasites was compared to three replicates of schizont-stage wild-type parasites. A 
one-tailed t-test was then used to determine those upregulated in gametocytes as 
highlighted in volcano plots in Fig. 3. For pbap2-g KO2 and pbap2-g2 the bio- 
logical triplicates of each of the hybridizations (both mutants against the wild type 
and against each other) were processed using the R version 2.15.0 software** with 
limma package“’. The data was background-corrected and normalized between 
the arrays (LOESS normalization). Fold changes between the strains and P values 
for differential expression were calculated with a linear statistical model. The P values 
from all experiments were adjusted using the false discovery rate correction. 
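
The thresholding and volcano-plot coordinates described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' workflow (they report using Microsoft Excel and R): the function names are hypothetical, the input is assumed to be one mean log2 fold change per transcript, and the cutoff is read as "more than 2 s.d. below the mean of all fold changes".

```python
import math
import statistics


def downregulated(log2_fc, n_sd=2.0):
    """Return transcript ids whose log2 fold change lies more than
    n_sd sample standard deviations below the mean of all values."""
    values = list(log2_fc.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    cutoff = mean - n_sd * sd
    return sorted(tid for tid, value in log2_fc.items() if value < cutoff)


def volcano_point(log2_fold_change, p_value):
    """Coordinates of one transcript on a volcano plot:
    x = log2 fold change, y = -log10 of the t-test P value."""
    return (log2_fold_change, -math.log10(p_value))
```

The P values themselves would come from a separate t-test; that step is left out here because it is not part of the Python standard library.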

For the gametocyte expression rank (Fig. 3a and Supplementary Table 6) the 
absolute intensity values from microarrays from three independent replicates of 
wild-type gametocytes were used and ranked from highest (1) to lowest (~4,553)
expression rank. To test for the deregulation of the gametocyte-specific genes in all
the strains, the enrichment in gametocyte-specific genes (expression rank 1 to 500)
in the top 500 genes showing the highest fold change in each of the mutants was
tested using Fisher’s exact test. Comparisons of the variances of the microarray
data were carried out in R and all the variances were similar; none of the samples 
were significantly different (P< 10 X 10~'*, F-test). Microarray data has been 
submitted to the GEO database (accession numbers: GSE52859 and GSE53246). 
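
The enrichment test described above (overlap between the 500 top-ranked gametocyte genes and the 500 genes with the highest fold change) can be written as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The sketch below is an assumption about how such a test could be set up, not the authors' R code; the function name and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p(overlap, set_a, set_b, total):
    """One-sided Fisher's exact test P value, P(X >= overlap), where X is
    the number of genes shared by two gene sets of sizes set_a and set_b
    drawn from a genome of `total` genes (hypergeometric upper tail)."""
    upper = min(set_a, set_b)
    numerator = sum(
        comb(set_a, k) * comb(total - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / comb(total, set_b)


# Illustrative call: 4,553 ranked genes, two sets of 500 genes, and a
# hypothetical overlap of 150 genes (not a value from the paper).
p_example = enrichment_p(150, 500, 500, 4553)
```

Exact integer arithmetic is used for the binomial coefficients, so the only rounding happens in the final division.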

Search for DNA-binding motifs. The genomic sequences for all P. berghei genes
were identified using PlasmoDB (version 9.1) and defined as a 2-kb region upstream 
of the transcription start site to the first base of the transcription start site (4,803 
entries). A file was also created for the gametocyte-specific genes (452 entries). 
Differences in usable entries were due to genes close to the ends of chromosomes 
or poorly assembled regions, and regions that overlapped other genes. A custom Perl 
script was used to count occurrences of the PbAP2-G and PbAP2-G2 motifs in the 
sequences using a regular expression (PbAP2-G was defined as /GxGTAC|GTACxC/ 
and PbAP2-G2 was defined by orthology as /TGCxACC|GGTxGCA/; ref. 6). The
script counts the occurrence of each pattern per-region and also provides a total 
number of sequences that contain at least one occurrence, and is available on request. 
Hypergeometric P values were calculated interactively using R version 2.10. 
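
A motif count of the kind described above could look like the following Python sketch. The original script was written in Perl and is not reproduced here; this version is only an approximation. It assumes that "x" in the published patterns stands for any nucleotide, and the names MOTIFS and count_motifs are hypothetical.

```python
import re

# Assumption: "x" in /GxGTAC|GTACxC/ and /TGCxACC|GGTxGCA/ means any base.
MOTIFS = {
    "PbAP2-G": re.compile(r"G[ACGT]GTAC|GTAC[ACGT]C"),
    "PbAP2-G2": re.compile(r"TGC[ACGT]ACC|GGT[ACGT]GCA"),
}


def count_motifs(regions, pattern):
    """Count non-overlapping motif matches in each upstream region.

    regions: dict mapping a gene id to its upstream sequence.
    Returns (matches per region, number of regions with at least one match).
    """
    per_region = {
        gene: len(pattern.findall(seq.upper())) for gene, seq in regions.items()
    }
    n_with_hit = sum(1 for count in per_region.values() if count > 0)
    return per_region, n_with_hit
```

Matches are counted without overlap, which is also how a global regular-expression match behaves in Perl; overlapping occurrences would need a lookahead pattern instead.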

RNA interference is widely distributed in eukaryotes and has a variety
of functions, including antiviral defence and gene regulation’”.
All RNA interference pathways use small single-stranded RNA (ssRNA) 
molecules that guide proteins of the Argonaute (Ago) family to com- 
plementary ssRNA targets: RNA-guided RNA interference’”. The 
role of prokaryotic Ago variants has remained elusive, although bio- 
informatics analysis has suggested their involvement in host defence’. 
Here we demonstrate that Ago of the bacterium Thermus thermo- 
philus (TtAgo) acts as a barrier for the uptake and propagation of 
foreign DNA. In vivo, TtAgo is loaded with 5’-phosphorylated DNA 
guides, 13-25 nucleotides in length, that are mostly plasmid derived 
and have a strong bias for a 5’-end deoxycytidine. These small inter- 
fering DNAs guide TtAgo to cleave complementary DNA strands. 
Hence, despite structural homology to its eukaryotic counterparts, 
TtAgo functions in host defence by DNA-guided DNA interference. 

To elucidate the physiological role of Ago in prokaryotes, we studied 
Ago from T. thermophilus. Comparison of the ago genes of the type strain 
HB27 (refs 4, 5) and a derivative with enhanced competence (HB27EC;
Fig. 1a and Extended Data Fig. 1a) revealed that an insertion sequence
(ISTth7)° disrupts ago in HB27EC. In line with a role of TtAgo in
reducing competence, a generated Δago mutant (HB27Δago; Fig. 1a)
has a natural transformation efficiency that is a factor of ten higher
than the wild-type HB27 (P < 0.02, Fig. 1b). Complementation of the
knockout strain with ago (HB27Δago::‘ago (HB27Δago complemented
with a strep(II)-tag-ago gene fusion insert); Fig. 1a, b) almost completely
restores the wild-type phenotype. Moreover, isolation of plasmid
and total DNA from the wild-type and the ago knockout strains revealed 
lower plasmid yields from the wild-type strain, indicating that TtAgo 
reduces the intracellular plasmid concentration (P< 0.02, Fig. 1c; 
P<0.02, Fig. 1d). 

We performed transcriptome analysis of HB27 and HB27Δago to
determine whether TtAgo-mediated interference proceeds directly by 
targeting plasmid DNA, or indirectly by regulating gene expression. 
Although the comparison revealed pleiotropic changes in gene expres- 
sion (Extended Data Fig. 2), we did not observe substantial differential 
expression of genes involved in plasmid uptake or host defence (Extended 
Data Table 1). Hence, RNA sequencing (RNA-seq) analysis suggests 
that TtAgo does not influence plasmid uptake and plasmid copy number 
at the level of transcriptional control. 

We therefore studied whether TtAgo interacts with plasmid DNA. 
In agreement with the RNA-seq analysis (Extended Data Fig. 2), affinity-purified
TtAgo expressed from the chromosome of HB27Δago::*ago
could be detected by protein mass spectrometry (Extended Data Table 2). 
Unfortunately, molecular analysis of TtAgo expressed in T. thermophilus 
was hampered by the low TtAgo yield, and attempts to overexpress 
TtAgo in T. thermophilus from a plasmid were unsuccessful. By con- 
trast, expression of Strep(II)-tagged TtAgo (Fig. 2a) in Escherichia coli 
was successful when performed at 20 °C. Under these conditions, TtAgo
has no effect on plasmid content (Extended Data Fig. 1b). Analysis of
co-purified nucleic acids revealed that TtAgo-associated RNA (10-150 
nucleotides) is preferentially 32P-labelled in a polynucleotide kinase

Figure 1 | TtAgo interferes with plasmid DNA. a, Overview of ago gene loci
of T. thermophilus strains: HB27 (wild type), HB27EC (spontaneous derivative
with enhanced competence), HB27Δago (knockout), and HB27Δago::*ago
(HB27Δago complemented with a strep(II)-tag-ago gene fusion insert). KanR,
kanamycin resistance marker. b, Transformation efficiency of T. thermophilus
strains on transformation with the plasmid pMHPnqosGFP (Extended Data
Table 5). Error bars indicate standard deviations of biological duplicates.
c, Yield of pMHPnqosGFP plasmid mini preparation (miniprep) of HB27 and
HB27Δago. Error bars indicate standard deviations of biological triplicates.
d, Plasmid content of total DNA purified from HB27Δago relative to that from
HB27, as quantified by Genetools (Syngene) after resolving the DNA on a 0.8%
agarose gel. Error bars indicate standard deviations of biological triplicates.

(PNK) forward reaction, indicating the presence of 5’ hydroxyl groups
(Extended Data Fig. 1c). By contrast, co-purified DNA has a more 
defined length (13-25 nucleotides), and is preferentially labelled in a 
PNK exchange reaction, indicating phosphorylated 5’ ends (Fig. 2b). A 
5' phosphate group is a general feature of Ago guides”""'. 

Whereas eukaryotic Ago proteins exclusively use ssRNA guides, some 
prokaryotic Ago proteins have a higher affinity for single-stranded DNA 
(ssDNA) guides””®. Moreover, the characteristics of the small DNAs that 
associate with TtAgo in vivo are in agreement with previously described 
in vitro guide requirements*'*'’. TtAgo catalyses cleavage of ssDNA 
targets in vitro when supplied with complementary 5’-phosphorylated 
21-nucleotide ssDNA guides, but not when supplied with analogous 
ssRNA guides*’*’? (Extended Data Fig. 3). During isolation of an active 
site double mutant, TtAgoDM (TtAgo(D478A,D546A); Fig. 2a), only 
RNAs co-purify (10-150 nucleotides; Extended Data Fig. 1c). This 
suggests that active site residues are involved in processing and/or 
binding of the ssDNA molecules. 

Cloning and sequencing of TtAgo-bound DNA molecules resulted 
in 70.6 million sequences, of which 65% can be mapped on the TtAgo 
expression plasmid pWUR702, 3% on the plasmid pRARE, and 32% 
on the chromosome of E. coli K12 (Extended Data Table 3). Remarkably, 
when normalized for the DNA content in each cell, TtAgo predomi- 
nantly co-purifies with guides complementary to pWUR702 and 
pRARE (approximately 54 and 8.8 times more frequently, respect- 
ively), rather than with guides complementary to the E. coli K12 chro- 
mosome (Extended Data Table 3). 
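
The normalization mentioned above amounts to dividing each replicon's share of mapped guides by its share of cellular DNA, then expressing the result relative to the chromosome. The sketch below illustrates that arithmetic only. The function name is hypothetical, and the DNA shares are placeholders chosen for illustration, since replicon sizes and copy numbers are not given in this text; it will therefore not reproduce the reported factors of 54 and 8.8.

```python
def relative_guide_enrichment(read_fraction, dna_fraction, reference="chromosome"):
    """Guide density per unit of DNA for each replicon, expressed
    relative to the density on the reference replicon."""
    density = {
        name: read_fraction[name] / dna_fraction[name] for name in read_fraction
    }
    return {name: density[name] / density[reference] for name in density}


# Fractions of mapped guide sequences reported in the text.
reads = {"pWUR702": 0.65, "pRARE": 0.03, "chromosome": 0.32}
# HYPOTHETICAL shares of total cellular DNA (placeholders, not measured values).
dna = {"pWUR702": 0.04, "pRARE": 0.01, "chromosome": 0.95}
enrichment = relative_guide_enrichment(reads, dna)
```

With real replicon sizes and copy numbers in place of the placeholders, values above 1 would indicate replicons that contribute more guides than their DNA content predicts.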

More detailed analysis of unique guide sequences revealed two popu- 
lations of DNA guides: one 15-nucleotides long, and the other ranging 
from 13 to 25 nucleotides in length (Fig. 2c). No obvious bias towards 
specific regions of the plasmids or the chromosome was detected: the 
guides target coding and non-coding regions on both strands independent
of GC content (Fig. 2e). Some guides map on one of the plasmids as
well as on the chromosome of E. coli (for example, on lacI and proL).
The fact that these guides do not seem to be under-represented com- 
pared with other plasmid-targeting guides indicates that there is no 
selection against chromosome-targeting guides, but rather that the 
differential guide loading (Extended Data Table 3) is a result of pref- 
erential acquisition of guides from plasmids. 

Interestingly, 89% of the DNA guides have a deoxycytidine (dC) at 
the first position at the 5’ end and 72% have a deoxyadenosine (dA) at 
the second position (Fig. 2d). Despite this bias, identical TtAgo cleav- 
age activities are observed with DNA guides containing a 5’ dC, dT, dA 
or dG (Extended Data Fig. 4a-d). The 5’ dC preference may result 
from specific guide processing, or from preferential 5’ nucleoside selec- 
tion by TtAgo. A bias for specific 5’ nucleosides also occurs in certain 
eukaryotic Ago proteins'*”. 

We performed activity assays to investigate whether the in vivo 
plasmid-derived ssDNAs are functional guides that enable TtAgo to 
cleave double-stranded DNA (dsDNA) targets (expression plasmid 
pWUR702). Purified TtAgo linearizes or nicks pWUR702, resulting 
in linear or open circular plasmid DNA, respectively (Fig. 3a, lane 4), 
whereas TtAgoDM does not show this activity (Fig. 3a, lane 3). The 
cleavage activity of TtAgo is strongly temperature dependent: whereas 
ssDNA is cleaved at temperatures ≥20 °C, plasmid DNA is only cleaved
at temperatures ≥65 °C (Extended Data Fig. 4e, f). This agrees with the
observation that during TtAgo expression in E. coli at 20 °C, plasmid 
concentrations are not decreased (Extended Data Fig. 1b). Purified 
TtAgo is unable to cleave plasmids that have no sequence similarity 
to pWUR702 or pRARE (for example, pWUR708; Fig. 3b, lane 4). 
However, when supplied with two synthetic 5’-phosphorylated ssDNA 
guides that target both strands of the plasmid at the same locus (Fig. 4b), 
TtAgo was able to linearize or nick pWUR708 (Fig. 3b, lane 8). These 
findings, together with the guide sequence data, indicate that the in vivo 
acquired DNA molecules guide TtAgo to cleave dsDNA targets. We 

Figure 3 | TtAgo cleaves plasmids complementary to its guides.
a, b, Untreated target plasmid (lane 1, 5), plasmid incubated at 75 °C in the
absence of proteins (lane 2, 6), or in the presence of TtAgoDM (lane 3, 7) or 
TtAgo (lane 4, 8) purified from E. coli, resolved on 0.8% agarose gels. LIN, 
linear; M1, 1 kb Generuler marker (Fermentas); M2, linearized and untreated 
target plasmid; OC, open circular; SC, supercoiled plasmid. a, TtAgo expression 
vector pWUR702. b, Target plasmid pWUR708, which shares no sequence 
identity with expression vector pWUR702 or pRARE. Additionally, synthetic 
(Syn.) ssDNA guides were added to the reactions with pWUR708 (lane 5-8). 


propose to refer to these guides of TtAgo as small interfering DNAs 
(siDNAs). 

To gain insight into the molecular mechanism of dsDNA cleavage 
by TtAgo, we performed additional in vitro plasmid cleavage assays 
using purified TtAgo loaded with synthetic siDNAs. Negatively super- 
coiled plasmids (isolated from E. coli) were used, because at least 95% 
of all plasmids isolated from T. thermophilus have a negatively super- 
coiled topology'*””. Negative supercoiling facilitates melting of the DNA 
duplex, especially at elevated temperatures'*°. Target plasmids pWUR704 
and pWUR705 are identical except for the flanking regions of the target 
site (AT-rich or GC-rich; Fig. 4a). Both plasmids share no sequence 
similarity with TtAgo expression plasmid pWUR702, and they are not 
cleaved by TtAgo unless complementary siDNAs are added (Fig. 4c). 
When supplied with a single 21-nucleotide siDNA, TtAgo nicks the 
negatively supercoiled plasmid (Fig. 4c, lanes 3, 4), and when supplied 
with a mixture of two 21-nucleotide siDNAs that target both DNA 
strands at the same locus, TtAgo linearizes the plasmid (Fig. 4b, c, lane 5). 
Both nicking and dsDNA cleavage are more efficient when the target 
sequence is flanked by AT-rich regions (Fig. 4a, c and Extended Data 
Fig. 5a, b). Interestingly, the same TtAgo-siDNA complexes are not 
able to cleave linearized plasmids (Extended Data Fig. 5c, d). This 
suggests that cleavage of dsDNA by TtAgo depends on the negatively 
supercoiled topology of the target DNA. 

Subsequent analysis revealed that the TtAgo-siDNA complex is 
able to linearize a relaxed, nicked plasmid if its target site is directly 
opposite the first nick (Extended Data Fig. 5e). If the nicked site is 
located further away (33 bp) from the target site, linearization of the 
nicked plasmid occurs only if the target region is AT-rich (Extended 
Data Fig. 5f, g). Thus, although the negatively supercoiled topology of 
the plasmid is lost after the primary nick, the nick facilitates local melting 
of the dsDNA (especially in AT-rich DNA), which allows TtAgo-siDNA 
complexes to nick the second strand, resulting in a dsDNA break. Like 
eukaryotic Ago proteins”, the TtAgo-siDNA complex cleaves a phos- 
phate ester bond between the target nucleotides that base pair with 

Figure 4 | TtAgo cleaves plasmids by nicking two strands. a, Plasmids
pWUR704 and pWUR705 contain a 98 bp target region with a GC content of 
17% or 59%, respectively, as indicated in blue (for details, see Extended Data 
Fig. 5a, b). b, Part of the pWUR704 and pWUR705 target site (indicated in blue) 
and complementary ssDNA guides used in this experiment (indicated in red). 
Black triangles indicate predicted cleavage sites. c, 0.8% agarose gels loaded with 
pWUR704 and pWUR705 plasmids that were incubated without proteins 
(lane 1), or with TtAgo (lane 2), TtAgo-forward (FW) guide complex (lane 3), 
TtAgo-reverse (RV) guide complex (lane 4), or TtAgo-FW and TtAgo-RV
guide complexes (lane 5). LIN, linear; M1, open circular and linear pWUR704 or
pWUR705; M2, 1 kb Generuler marker (Fermentas); OC, open circular; SC,
supercoiled plasmid. 


guide nucleotides 10 and 11 (ref. 22). Sequence analysis of a cleaved 
dsDNA target (Extended Data Fig. 5h) demonstrated that dsDNA 
breaks also result from nicking both strands at the canonical Ago 
cleavage site. 
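
The canonical cleavage position described above (between the target nucleotides that pair with guide nucleotides 10 and 11, counted from the guide 5' end) can be located computationally. The following Python sketch is an illustration only: it assumes a fully complementary target site, uses hypothetical function names, and is not a model of TtAgo's tolerance for mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cleavage_index(guide, target):
    """Return i such that the target strand is cut between target[i-1]
    and target[i], or None if no fully complementary site is found.

    target[i-1] pairs with guide nucleotide 11 and target[i] pairs with
    guide nucleotide 10 (1-based, counted from the guide 5' end).
    """
    guide = guide.upper()
    target = target.upper()
    if len(guide) < 11:
        raise ValueError("guide must be at least 11 nucleotides long")
    start = target.find(reverse_complement(guide))
    if start == -1:
        return None
    return start + len(guide) - 10
```

For a 21-nucleotide guide the cut falls 11 nucleotides into the complementary site on the target strand; a second guide against the opposite strand would be handled by calling the function with that strand as the target.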

While this manuscript was under revision, a characterization of a 
prokaryotic Ago protein from Rhodobacter sphaeroides (RsAgo) was 
published’. Despite similarities in the overall domain architecture of 
TtAgo and RsAgo, there are major functional differences between these 
proteins. RsAgo acquires mRNA-derived RNA guides with a 5’ uridine 
(U), whereas TtAgo acquires DNA guides with a 5’ dC. In both proteins, 
guides complementary to plasmids are over-represented. However, RsAgo 
lacks a functional catalytic site and functions by target-binding alone. 
TtAgo, on the other hand, harbours a functional catalytic site allowing 
cleavage of both single- and double-stranded targets. 

On the basis of our findings, we propose a model for DNA interfer- 
ence by TtAgo. On the entry of plasmid DNA into the cell, TtAgo 
acquires siDNA guides (13-25 nucleotides in length) from the invader. 
Although the mechanism of guide acquisition by TtAgo is unknown, 
the requirement of an intact catalytic site suggests involvement of the 
nuclease itself. TtAgo is loaded with siDNAs that are preferentially 
derived from plasmids; as such, single guides may allow for neutra- 
lization of multi-copy invaders. Combining our in vivo and in vitro 
data, we speculate that TtAgo uses siDNA guides to specifically cleave 
ssDNA targets, such as DNA taken up by the natural competence system” 
or replication intermediates. The siDNA-TtAgo complex also targets 
negatively supercoiled dsDNA, which results in plasmid nicking. Espe- 
cially in the case of plasmid DNA, single-strand breaks will result in 
loss of the supercoiled topology and, as such, in decreased transcription 
levels”*. Furthermore, if the nick site is located in an AT-rich region, 
TtAgo loaded with an siDNA that targets the opposite strand may generate 
a dsDNA break, potentially leading to degradation of the plasmid by
other nucleases. The observation that invading DNA elements generally 
have a lower GC content than their hosts** may explain self/non-self 
discrimination by TtAgo. Whereas the eukaryotic Ago protein is a key 
component of sophisticated multi-enzyme systems for RNA-guided 
RNA interference, we reveal the biochemical activity and functional 
importance of an evolutionarily related enzyme in prokaryotes that 
protects its host against mobile genetic elements through DNA-guided 
DNA interference. 


METHODS SUMMARY 


T. thermophilus HB27, HB27EC, and two derivatives of the HB27 strain, HB27Δago
and HB27Δago::‘ago, were used for plasmid transformation experiments. Plasmid
pMHPnqosGFP was isolated from HB27 and HB27Δago. RNA for RNA-seq analysis
was purified from HB27 and HB27Δago during mid-log-phase growth. Strep(II)-tagged
TtAgo was heterologously produced from a plasmid in E. coli KRX (Promega)
and purified by affinity purification before analyses of co-purified nucleic acids. 
Guides co-purified with TtAgo or synthetic guides were used in in vitro TtAgo 
cleavage assays using synthetic ssDNA or dsDNA plasmid as targets. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

CORRECTIONS & AMENDMENTS


CORRIGENDUM 
doi:10.1038/nature13125 


Corrigendum: Crystal structure of a 
nitrate/nitrite exchanger 


Hongjin Zheng, Goragot Wisedchaisri & Tamir Gonen 


Nature 497, 647-651 (2013); doi:10.1038/nature12139 


While revising our manuscript prior to formal acceptance, we acci- 
dentally removed two references from the main text. The footnote to 
Table 1 of this Letter should cite ref. 1 (not ref. 24) for footnote symbol
† and ref. 2 (not ref. 21) for footnote symbol ‡. We apologise for any
inconvenience. 


1. Unkles, S. E. et al. Two perfectly conserved arginine residues are required for 
substrate binding in a high-affinity nitrate transporter. Proc. Natl Acad. Sci. USA 
101, 17549-17554 (2004). 

2. Unkles, S. E. et al. Alanine scanning mutagenesis of a high-affinity nitrate 
transporter highlights the requirement for glycine and asparagine residues in the 
two nitrate signature motifs. Biochem. J. 447, 35-42 (2012). 


CORRIGENDUM 
doi:10.1038/nature13126 


Corrigendum: An environmental
bacterial taxon with a large and
distinct metabolic repertoire


Micheal C. Wilson, Tetsushi Mori, Christian Rückert,
Agustinus R. Uria, Maximilian J. Helf, Kentaro Takada,
Christine Gernert, Ursula A. E. Steffens, Nina Heycke,
Susanne Schmitt, Christian Rinke, Eric J. N. Helfrich,
Alexander O. Brachmann, Cristian Gurgui,
Toshiyuki Wakimoto, Matthias Kracht, Max Crüsemann,
Ute Hentschel, Ikuro Abe, Shigeki Matsunaga, Jörn Kalinowski,
Haruko Takeyama & Jörn Piel


Nature 506, 58-62 (2014); doi:10.1038/nature12959 


One of the accession numbers for this Article was listed as AZHXW01000000 
instead of AZHX01000000. It has been corrected in the online versions 
of the paper. 


CORRIGENDUM 
doi:10.1038/nature13143 


Corrigendum: Mesoangioblast stem 
cells ameliorate muscle function in 
dystrophic dogs 

Maurilio Sampaolesi, Stephane Blot, Giuseppe D’Antona,
Nicolas Granger, Rossana Tonlorenzi, Anna Innocenzi,
Paolo Mognol, Jean-Lauren Thibaud, Beatriz G. Galvez,
Ines Barthelemy, Laura Perani, Sara Mantero, Maria Guttinger,
Orietta Pansarasa, Chiara Rinaldi, M. Gabriella Cusella De
Angelis, Yvan Torrente, Claudio Bordignon, Roberto Bottinelli
& Giulio Cossu


Nature 444, 574-579 (2006); doi:10.1038/nature05282 and
corrigendum Nature 494, 506 (2013); doi:10.1038/nature11976 


In Fig. 4b of this Article, the gel for the loading control MyHC for the 
dog Varus was an unintentional duplication of the loading controls 
for the dog Vampire (which is correct). The correct gel is shown below 
in Fig. 1. The error does not affect any of our results and a University 
College London (UCL) committee appointed by the Vice Provost for 
Research has investigated the data presented in this Article and is 
satisfied of its authenticity. 


Correspondence should be addressed to G.C. (giulio.cossu@manchester.ac.uk). 


Figure 1 | This is the corrected lower-left panel of Fig. 4b of the original
Article. Panel label: Varus.

CAREERS




A fracking site in Williston, North Dakota. The state has suffered a housing shortage as a result of thousands of workers flocking to join the oil boom. 


Fracking fundamentals 


Scientists in the United States who are looking to ride the gas-exploration boom can find a
variety of options for employment, from chemical research to environmental monitoring. 


BY SID PERKINS 


The practice of hydrofracturing (commonly
called ‘fracking’) is booming in
North America. The United States is the
world’s largest producer of shale gas, followed
by Canada, and US shale-gas production
increased tenfold between 2006 and 2013.
And despite concerns about the sustainability 
of fracking (see J. D. Hughes Nature 494, 307- 
308; 2013) and its environmental impact, job 
opportunities in the industry — many of them 
science-related — are flourishing. 

Fracking involves pumping large amounts of 
chemical-laden water and sand into subterra- 
nean shale formations to shatter rock and then 
prop open the resulting fissures, which frees 
up the oil and natural gas entombed there. The 
increase in hydrofracturing is driving a need 
for field geologists and petroleum engineers,
as well as opening up job prospects for a wide
variety of scientists, including chemists and 
environmental engineers. Many of the posts 
are related to the need to treat, recycle or dis- 
pose of the millions of litres of wastewater that 
a hydrofractured well can generate. 


WELLSPRING OF OPPORTUNITY 

The oil-and-gas industry consists of a wide 
range of companies, all the way from major 
producers (such as BP in London and Exxon- 
Mobil in Irving, Texas) and the subcontractors 
that provide services to them (such as Halli- 
burton in Houston, Texas) down to consult- 
ing firms. Hence, the job opportunities are 
widespread and varied, with different compa- 
nies often requiring different sets of technical 
skills and levels of experience. Most hire peo- 
ple holding bachelor’s degrees and then train 
them in-house, says Michael Webber, deputy
director of the Energy Institute at the Univer-
sity of Texas in Austin. But there are also plenty 
of slots for applicants with advanced degrees. 

Preliminary figures from the American 
Geosciences Institute (AGI) in Alexandria, 
Virginia, show that about 75% of last year’s 
US graduates in geology and geophysics went 
into the oil-and-gas industry. Furthermore, 
about 46% of those earning master’s degrees 
and about 33% of those gaining PhDs in the 
United States also headed for the sector. This 
is “a very fundamental change”, says Christopher
Keane, the AGI’s director of technology
and communications; three years ago, the 
AGI reported that only some 10% of recently 
minted PhDs went into the private sector. This 
move towards industry may stem in part from 
a relatively limited academic market. 

Salaries in the oil-and-gas industry, including
the fracking sector, are attractive


13 MARCH 2014 | VOL 507 | NATURE | 263 


© 2014 Macmillan Publishers Limited. All rights reserved 


> compared with most starting academic 
positions. According to the US Bureau of Labor 
Statistics, the median income of US geoscien- 
tists was just under US$91,000 in 2012. And 
the Bureau predicts that the number of geosci- 
entist positions will leap by 16% (an increase of 
about 6,000 posts) by 2022, a full five percent- 
age points higher than the average job growth 
in the United States during the same period. 
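
The Bureau's figures can be cross-checked with a little arithmetic. The sketch below treats the rounded numbers quoted above as exact (an assumption) and backs out the size of the workforce implied by a 16% rise of about 6,000 posts, along with the average US job growth implied by the five-point gap. The variable names are illustrative only.

```python
# Rounded figures quoted in the text, treated here as exact (an assumption).
growth_pct = 16     # projected rise in geoscientist posts by 2022, in percent
new_posts = 6_000   # approximate number of posts that rise represents
gap_points = 5      # percentage points above average US job growth

# If 16% of the starting workforce is about 6,000 posts, the starting
# workforce is 6,000 divided by 0.16.
implied_base = new_posts * 100 / growth_pct
implied_total = implied_base + new_posts

# A five-point lead over the national average implies that average.
average_growth_pct = growth_pct - gap_points

print(f"Implied starting workforce: {implied_base:,.0f} posts")
print(f"Implied workforce by 2022: {implied_total:,.0f} posts")
print(f"Implied average US job growth: {average_growth_pct}%")
```

On these rounded inputs, the implied starting workforce is about 37,500 posts and the implied average job growth is 11%. Both are back-of-envelope values, not figures published by the Bureau.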

North America is currently the hotbed for 
fracking-related jobs for scientists, but oil- 
and-gas-rich shales elsewhere will be tapped 
at increasing rates over the coming decades. 
For now, shale-gas production in Europe is 
almost zero but is expected to rise to nearly 85 
billion cubic metres by 2040. Likewise, China 
is expected to produce 141.5 billion cubic 
metres of shale gas by 2040, making up 50% of 
the country’s natural-gas production. 


PLUGGING THE GAPS 
Poll data underscore the requirement for par- 
ticular technical skills related to fracking. A 
survey of oil-and-gas industry professionals 
by the Society of Petroleum Engineers, based 
in Richardson, Texas (J. Petrol. Technol. 65, 
82-85; 2013), identified a need for people with 
skills in the recycling, disposal and treatment 
of wastewater. 

Keane notes that prospective employers cite 
two potential skills gaps in particular among 
new recruits: a lack of quantitative skills (such 
as expertise in fluid dynamics) and a lack of field
experience (only 40% of recent graduates had 
attended at least one 6-week-long field camp, 
the equivalent of an internship). Filling these 
gaps would boost a job candidate's desirability. 
Although gaining quantitative skills is fairly
standard in a geoscience or petroleum-engineering
degree, getting pre-graduation field experience
is a bit harder, says J. Foster Sawyer, an
exploration geologist at South Dakota School of
Mines & Technology in Rapid City. Most of the
schools offering petroleum-engineering degrees
reserve their field camps for students at their
schools, so those seeking to maximize their
chances to secure this credential should
consider attending these programmes.

Fracking in Greene County, Pennsylvania.

People with a keen knowledge of rock mechanics
(how rocks respond to force) and petrophysics
(how rocks and fluids interact) “are in very short
supply right now”, says Scott Tinker, a subsurface
geologist and associate dean at the University
of Texas at Austin. Before returning to academia
15 years ago, he spent 17 years in the
oil-and-gas industry gaining such skills — 
interpreting rock, seismic and borehole sen- 
sor data to search for and develop oil- and 
gas-rich deposits, and then using those data to 
build three-dimensional models of oil and gas 
reservoirs. Such experience is mostly gained 
through on-the-job training, although some 
institutions offer classes in such areas. Inter- 
ested scientists should carefully investigate 
programmes to ensure that they are offering 
marketable skills. 


PROBLEMS CREATE OPPORTUNITIES 
The problems associated with fracking waste- 
water can be attacked on several fronts, opening 
up niches for a range of scientists. The search 
is on for alternatives to the current cocktail of 
chemicals that is added to water used for frack- 
ing: these chemicals (which are often noxious 
by themselves) have ample opportunity to 
react with each other in the hot, high-pressure 
environment deep within the well, spawning 
potentially even more unpleasant by-products. 
And the potential for pollution will increase: 
although about 34% of today’s US natural gas 
production comes from fracking, that fraction 
will rise to 50% in 2040, according to the US 
Energy Information Administration. 
Chemists could play a major part in reduc- 
ing wastewater problems, says David Alleman 
of ALL Consulting in Tulsa, Oklahoma. For one 
thing, he notes, researchers — whether in the oil- 
and-gas industry or in academia — are trying 
to mitigate environmental impacts by design- 
ing greener chemicals that either degrade more 
quickly or are less toxic. Moreover, chemists are 
looking to design blends in which the ingredi- 
ents do not react detrimentally with each other 
within wells. “There’s a lot of room for ‘down- 
hole’ chemists to figure that out,” says Webber. 
There is also a demand for civil engineers 
tasked with projects such as designing better 
surface ponds for storing wastewater. That is
because much of the risk from fracking comes
from leakage into aquifers, and this occurs 
mostly from the surface rather than from wells 
below ground. Environmental engineers and 
wastewater-treatment specialists could also 
use their expertise to alleviate problems after 
a well has been fracked — either by treating 
the water or by developing no- or low-water 
technologies. 

Petroleum engineer Mukul Sharma heads 
a research group at the University of Texas at 
Austin that is trying, among other things, to 
develop alternative fracking fluids. In Sharma's 
team, which includes about 27 graduate and 
five undergraduate students, most are pursu- 
ing degrees in chemical, mechanical or civil 
engineering. But the team also boasts students 
pursuing degrees in applied maths, geology or 
geophysics. “This is a very interdisciplinary 
problem, so it requires people who have a wide 
variety of backgrounds,” he says.

Sharma and his colleagues, both within the 
industry and in academia, face a tough chal- 
lenge. Possible alternatives to chemical-laden 
water include foams based on nitrogen or 
carbon dioxide. Such fluids would reduce the 
volume of waste generated during the fracking 
process because the gas could be removed from 
the foam after use. But a downside might be the 
need to inject chemicals to break up the thick 
foam (potentially creating chemical waste of 
a different sort). Issues such as these provide 
ample research opportunities. 


RETIRING TYPE 

The 16% rise in the number of geoscientist 
jobs in the United States by 2022 doesn’t take 
into account the sizeable number of positions 
that will open owing to retirements or attri- 
tion. According to the most recently published 
AGI data, about 12% of the geoscientists 
working in 2011 are expected to retire by 2018. 

The retirements will mean a big loss of 
technical knowledge in the federal and state 
regulatory and safety agencies, says Keane, 
so expert environmental engineers will be 
needed to monitor air quality and chemical 
use. For people joining the field, he notes 
that “it’s going to be a tough transition”, but
early-career scientists will be well positioned 
to quickly advance into managerial positions, 
and experienced scientists also have cause 
for optimism. “People who have been in the 
field five to ten years will have incredible 
opportunities.” 

Whether mid-career scientists or freshly 
minted graduates, geoscientists interested in 
the oil-and-gas industry have plenty of options 
— and in some cases better prospects than in 
academia for well-paying posts with advance- 
ment potential. “Right now, the job market is 
strong, and the future for young geoscientists 
is very bright,” says Sawyer.


Sid Perkins is a freelance writer based in 
Crossville, Tennessee. 




TURNING POINT 


Johan Bollen 


Johan Bollen caused a stir in January when 
he and his colleagues proposed an alternative 
science-funding model (J. Bollen et al. EMBO 
Rep. http://doi.org/f2pz34; 2014). Bollen,
an informatician at Indiana University
Bloomington, explains how the proposal 
developed, and how the idea of resource 
allocation became part of his research agenda. 


What got you thinking about funding models? 
A lot of people are unhappy with the current 
system. When you submit a proposal, you 
are like a contractor, but science does not 
work like that — it works best by generating 
ideas and gifting them to society and other 
scientists. 


How did your idea take shape? 

Some friends and colleagues had a Christmas 
party in 2012, and as soon as alcohol started 
to flow, so did commiseration. Guests talked 
about reviewer comments on proposals, mar- 
velling that one person can have that much 
power. The disgruntlement is a by-product 
of how the review system works. I started by 
saying, “Why not just take all that money and 
distribute it evenly?”. The goal was to see if we 
could, with as little administration as possible, 
distribute funding so that researchers have the 
freedom to explore the topics that they think 
matter most. 


Briefly, what is your plan for science funding? 
All scientists would receive a base amount 
— for example, US$100,000, which roughly 
corresponds to the US National Science Foun- 
dation’s 2010 budget divided by the number 
of senior researchers funded that year. Each 
scientist would be required to distribute a pre- 
determined percentage of their funding to the 
researchers whom they believed would make 
best use of the money. 
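
The mechanics of the plan can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model from the EMBO Reports paper: the 50% pass-on fraction, the single round, the weighted split between recipients, the three scientists and the function name `allocate` are all hypothetical choices made for the example.

```python
def allocate(base, fraction, choices):
    """Return each scientist's funding after one round of redistribution.

    base     -- equal amount every scientist starts with
    fraction -- share of that amount each scientist must pass on (0 to 1)
    choices  -- maps each donor to {recipient: weight}; the weights set how
                the donor splits the passed-on share (hypothetical format)
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")

    # Everyone keeps the part of the base amount they need not give away.
    funding = {name: base * (1 - fraction) for name in choices}

    for donor, weights in choices.items():
        if donor in weights:
            raise ValueError(f"{donor} cannot fund themselves")
        pool = base * fraction
        total_weight = sum(weights.values())
        # Split the donor's pool among chosen recipients in proportion to weight.
        for recipient, weight in weights.items():
            funding[recipient] += pool * weight / total_weight
    return funding


# Three hypothetical scientists, each starting with US$100,000 and
# required to pass on half of it (an assumed fraction).
choices = {
    "ana": {"ben": 1, "chen": 1},
    "ben": {"chen": 1},
    "chen": {"ana": 3, "ben": 1},
}
result = allocate(100_000, 0.5, choices)
for name, amount in result.items():
    print(f"{name}: ${amount:,.0f}")
```

In this toy run the total stays at US$300,000, but the scientist who attracts the most support from colleagues ends up with the largest share.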


How did the proposal evolve? 

A big concern emerged: some scientists who 
do not deserve funding will get it. But, we 
thought, what if every scientist had to distrib- 
ute some of their funding to others on the basis 
of their track records? The more we thought 
about it, the more viable it seemed. 


What kind of feedback have you had? 

The feedback has been mostly positive, but the 
proposal is generally regarded as too crazy to 
work. The main critique is that this is a form 
of collusion: giving money to a colleague 
sounds like nepotism. But it would be easy 
to have conflict-of-interest rules. We could,
for example, use funding databases to see if
donors were former advisers or at the same 
institution as recipients. The problem is that 
the system has no top-down control, which 
doesn't work for some people. 


How do you respond to critics who say that 
your proposal is anti-peer review? 

Peer review is a valuable tool, but funding 
panels can be costly and have wildly different 
outcomes. Reviewing one-project proposals is 
not the best way to allocate funding — I think 
we should fund people rather than projects. 


Your research models public mood using 
social media. Are you modelling the 
response to your proposal? 

Not scientifically. I have been on Twitter, 
mostly to answer questions. It sounds callous, 
but I do not care if people like the proposal. I 
want them to reconsider their allegiance to the 
existing system. 


Will you continue to push the concept? 
Absolutely. My colleagues and I are talking to 
funders to see if we can run some experiments, 
including ones with actual funding being dis- 
tributed and ones involving social choice and 
funding in selected communities. 


How has the idea of innovative resource 
allocation bled into your research? 

I have become enamoured with the idea that 
society could allocate resources by crowd- 
sourcing rather than assembling panels of 
experts. I plan to focus more on how resource- 
allocation algorithms could be applied to soci- 
etal problems such as poverty alleviation. The 
decisions of the few may not always be better 
than the decisions of the many.

DEMOGRAPHICS
UK science workforce 


Ethnic-minority workers are most highly 
represented in the most senior and junior 
positions of the UK scientific workforce, 
says a 7 March report. A Picture of the
UK Scientific Workforce, from the Royal
Society in London, examines gender, 
ethnicity and other factors, and is the 
most comprehensive analysis of its type, 
says Julia Higgins, chair of the report's 
steering group. The report finds that black 
researchers are slightly under-represented 
in the most senior roles, whereas
scientists from China are statistically
over-represented in those positions. It also 
finds that although women comprise just 
over half of the scientific workforce, they 
account for less than one-quarter of those 
in the highest-level positions. 


GENDER 


Skewed rankings 


Female full professors are less likely than 
men to co-author papers with assistant 
professors of the same sex, finds a study 
(J. F. Benenson et al. Curr. Biol. 24, R190-R191;
2014). Study authors calculated
the expected co-author combinations
for papers published from 2008 to 2011
by psychologists at 50 US and Canadian 
universities. They found 14 pairings of 
senior and junior women, compared 
with the expected 29, and 76 pairings of 
senior and junior men, compared with the 
expected 61. Women's tendency to pair 
with another woman of the same rank 
impedes their academic mobility, says 
co-author Joyce Benenson of Emmanuel 
College in Boston, Massachusetts. 


CAREER PROGRESSION 
Costs of childcare 


Career interruptions for childcare cost 
female physicians earning power, says a 
German study (A. Evers and M. Sieverding 
Psychol. Women Q. 38, 93-106; 2014).
The authors surveyed medical students
in 1989, asking in part about attitudes
towards medical school. A poll of the same 
cohort 15 years later revealed that earnings 
correlate with career absences, not with 
the respondents’ earlier outlook. Some 
87% of the 47 female respondents reported 
absences of an average of 1.8 years, mostly 
for childcare; just under two-thirds of the 
52 men reported absences of an average of 
7.2 months, mainly for non-employment. 
Roughly 90% of men were earning
more than €36,000 (US$49,440) a year,
compared with 55% of women.

A KITE FOR SARAH


BY DAVID G. BLAKE 


“What’s it like when
they shut one down,
Papa?”


Sarah watched me, her emerald 
eyes cut with a purity I refused to 
blemish. I lied to her instead. “I 
don’t know, sweetie.”

“I bet it’s a lot like being a kite
that’s lost and floating high in the
sky.”

The truth veered more towards 
being the one punished for los- 
ing the kite, yet I spared her again. 
“That sounds wonderful.” 

“I’ll ask Mother. She knows
everything.”

Nothing could be further from 
what I wanted. Her mother would 
grind that beautiful innocence 
into a nub of ugly truth. I calmed 
myself by imagining I really was 
like a lost kite soaring high. It felt 
so... so free. 

“You know she'll be tired when 
she gets home from work. Put on your 
pyjamas, brush your teeth and go to bed. We 
can talk tomorrow.”

Her bottom lip jutted in an exaggerated 
pout and she stomped away; but it was 
not in her nature to stay mad for long. She 
poked her head out from the bathroom a few 
moments later. “I love you, Papa,” she said, 
her smile an aureole of smeared toothpaste 
and happiness. 

“I love you, sweetheart.” A truth that magnified
my suffering a hundredfold and made
it somehow bearable at the same time. 


Elizabeth stormed in around eleven. 

I blocked the stairs and mustered what I 
hoped would not prove to be the last scrap 
of defiance left in me. “They shut her teacher 
off today. Right in front of the class.” 

Her green eyes — so much like Sarah’s, yet
so different — thinned. She tugged off her 
gloves one finger after another and tossed 
them onto the counter top. “She questioned 
you?” 

My answering nod felt heavy, laden with 
betrayal. 

“And how did you respond?” 

“I lied.”

She slapped me. Softly. Hard would have 
shown a measure of respect instead of cold 
indignation. “Don’t be impertinent.”

“I told her I didn't know.”

In search of freedom.


“That'll have to do, I suppose. I'll make the 
necessary arrangements in the morning.” 

“Please, Elizabeth, I don’t think she is 
ready.”

“Are you sure it is Sarah who is not ready?” 
She held up her hand. “Don't bother giving 
an answer. It’s not your place to think. You're 
to do as encoded until your usefulness has 
run its course. You remember what happens 
after that, don't you?” 

“But sh —” 

“You’ll never replace Peter. Never. I don't
care how much you look like him. Now get 
out of my sight. Your face disgusts me.” 

I snuck upstairs and watched Sarah sleep. 
Several times throughout the night, I almost 
woke her and confessed everything, but I 
could not so easily relinquish what little time 
we had left together. I also loathed the idea of 
my little girl becoming like her mother, and 
the quicker she learned the truth, the sooner 
that would be. 


“Are we truly going to fly a kite, Papa?” 

I nodded, afraid my voice would break 
to match my heart if I tried to speak; I had 
planned the day knowing it would be our last 
spent together. 


She clutched the kite — just a simple red
one shaped like a teardrop — with
unrepressed enthusiasm. I marvelled
at how such a thing could make her
so utterly happy.

We walked to the field behind 
our house, and I showed her how 
to make the kite fly. She ran back 
and forth, gaze fixed on the whirl- 
ing red teardrop. I joined her, arms 
spread wide as if I might also catch 
an updraft and soar away; it was a 
nice thought while it lasted. She 
laughed and spread her arms like 
mine, eyes reflecting the life of the 
summer grass. The kite circled above 
us with purpose, tethered only for so 
long as Sarah's grip remained true; in 
such we were more alike than I had 
imagined. 

The sun scampered across the 
sky towards the frowning horizon. 
I stopped running in circles and 
twirled her around instead, losing 
myself in her laughter a little while 
longer. Just like that, our last day 
together started to end. “We should 
head back,” I said. “Your mother’s 
going to be home soon.” I knew Elizabeth 
would not be late, not on this night. 

“Do we have to, Papa?” 

“I’m afraid so, honey.” Such a truth made
nothing bearable. “Want me to show you how 
to spool the kite?” 

“No. I want to let it go free.” 

Free. What a powerful word. To hear it 
uttered by her, even once, was more than I 
could have hoped. I had decided — while I 
planned our last day, or perhaps later when 
I spun her in my arms and lost myself in her 
laughter — to tell her the truth; risk even 
more suffering, so that she would under- 
stand. But this was how I wanted her to 
remember me. 

“Go ahead.” It was nice to know I still had 
a crumb of defiance left in me. “Set it free.”

Her hands opened like a summertime 
bloom. The kite rode the wind into the dark- 
ening sky, our similarities ending in one last 
flicker of red. “Will it go to heaven, Papa?” 

I brushed away my tears before she could 
spot them. “I am sure it will float somewhere 
nice, sweetheart.”


David G. Blake lives in Pennsylvania with 
his girlfriend and their chocolate Labrador. 
In addition to Nature, his work has appeared 
in Beneath Ceaseless Skies, Daily Science 
Fiction and several other publications. For 
more info, visit his Facebook page. 


